This is an archive of the discontinued LLVM Phabricator instance.

Differential D102982

[LoopUnroll] Use smallest exact trip count from any exit
Closed · Public

Authored by nikic on May 23 2021, 3:48 AM.

Details

Reviewers

fhahn
reames
Meinersbur
mkazantsev

Commits

rG1ae266f4529f: [LoopUnroll] Use smallest exact trip count from any exit

Summary

This is a more general alternative/extension to D102635. Rather than handling the special case of "header exit with non-exiting latch", this unrolls against the smallest constant exact trip count from any (latch-dominating) exit, regardless of whether the latch is also exiting or not.

The motivating case is in full-unroll-one-unpredictable-exit.ll. Here the header exit is an IV-based exit, while the latch exit is a data comparison. This kind of loop does not get rotated, because the latch is already exiting, and loop rotation doesn't try to distinguish between IV-based/SCEV-able latches.
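As a rough source-level picture of the motivating loop, the sketch below pairs a rolled form with the shape one would expect after unrolling against the header exit. It is only an illustration under stated assumptions: the names `Pair`, `equalRolled`, and `equalUnrolled` are hypothetical and do not come from the test file, and the real transformation operates on LLVM IR rather than C++.

```cpp
#include <array>
#include <cstddef>
#include <cstdint>

using Pair = std::array<std::int64_t, 2>; // hypothetical stand-in for [2 x i64]

// Rolled form: the header exit depends only on the induction variable,
// while the latch exit depends on loaded data.
bool equalRolled(const Pair &A1, const Pair &A2) {
  for (std::size_t IV = 0;; ++IV) {
    if (IV == 2) // header exit, exact trip count 3
      return true;
    if (A1[IV] != A2[IV]) // latch exit, not analyzable
      return false;
  }
}

// Expected shape after unrolling by 3 against the header exit: the induction
// variable and its comparison are gone, two data comparisons remain.
bool equalUnrolled(const Pair &A1, const Pair &A2) {
  if (A1[0] != A2[0])
    return false;
  if (A1[1] != A2[1])
    return false;
  return true; // third copy of the header: exit folded to "taken"
}
```

Both functions return the same result for every input, which is the property the unroll has to preserve.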

I believe that unrolling should be treating this kind of loop the same regardless of whether the IV-based comparison happens to be on the latch or some other block.

Diff Detail

Repository: rG LLVM Github Monorepo

Event Timeline

nikic created this revision. May 23 2021, 3:48 AM

Herald added subscribers: javed.absar, zzheng, hiraditya. May 23 2021, 3:48 AM

nikic requested review of this revision. May 23 2021, 3:48 AM

Herald added a project: Restricted Project. May 23 2021, 3:48 AM

Herald added a subscriber: llvm-commits.

Harbormaster completed remote builds in B105796: Diff 347240. May 23 2021, 4:21 AM

Looks like the PreserveCondBr flag is currently not respected when partial unrolling is performed, and the latch still gets simplified in that case.

nikic mentioned this in rG15b108442fc8: [LoopUnroll] Add test for partial unrolling again non-latch exit (NFC). May 23 2021, 2:10 PM

nikic mentioned this in D103026: [LoopUnroll] Explicitly specify exit to unroll against (NFCI). May 24 2021, 7:13 AM

Rebase over D103026.

nikic added a parent revision: D103026: [LoopUnroll] Explicitly specify exit to unroll against (NFCI). May 24 2021, 8:40 AM

nikic edited the summary of this revision. May 24 2021, 8:43 AM

Harbormaster completed remote builds in B105924: Diff 347408. May 24 2021, 9:22 AM

reames added inline comments. May 24 2021, 2:50 PM

llvm/lib/Transforms/Scalar/LoopUnrollPass.cpp
1133	I would strongly prefer this logic be sunk into the appropriate SCEV accessors. I'm fine with you doing that in a follow up provided you commit to doing so. I'll leave that decision up to you.

nikic added inline comments. May 24 2021, 3:31 PM

llvm/lib/Transforms/Scalar/LoopUnrollPass.cpp
1133	Could you please clarify which SCEV accessors you have in mind here? Do you mean sinking this into the getSmallConstantTripCount() variant that only accepts a Loop? I would have a couple of concerns with doing that:

- We also need to know which exit the trip count refers to.
- This is not really the trip count of the loop (just a loop exit), and I'm pretty sure changing that would break other users of the API.
- The limitation to latch-dominating exits here is not fundamental, and mainly there due to unclear profitability.

Maybe I misunderstood the suggestion though.

reames added inline comments. May 24 2021, 3:38 PM

llvm/lib/Transforms/Scalar/LoopUnrollPass.cpp
1133	I was referring specifically to the small constant trip count and small constant multiple versions which take Loop parameters. Your point about needing the exit block is true for the way the code is currently phrased. I'd missed that. I think the need for that can and should be removed (see my comment on the patch this one is based on), but if that's logistically complicated, I'm fine with us moving forward with this structure and then revisiting in the future. Any exit count for an exit which dominates the latch must be a (potentially conservative) exit count for the loop. So, I'm not quite sure what you mean with the rest of your comments.

nikic added inline comments. May 25 2021, 12:50 AM

llvm/lib/Transforms/Scalar/LoopUnrollPass.cpp
1133	> Your point about needing the exit block is true for the way the code is currently phrased. I'd missed that. I think the need for that can and should be removed (see my comment on the patch this one is based on), but if that's logistically complicated, I'm fine with us moving forward with this structure and then revisiting in the future.

Going to respond here to keep it in one place: I think what you're suggesting is effectively to use getSmallConstantMaxTripCount() and pass down that information only. However, the important distinction here is that the max trip count only tells you that branches after that trip count must be taken, not that branches before it cannot be taken (for that specific exit). Unrolling handles a number of different exits (exact, exact-or-zero, max, multiple-of) and this (unless I'm misunderstanding again) would only handle the max case, where we can only fold the final branch.

> Any exit count for an exit which dominates the latch must be a (potentially conservative) exit count for the loop. So, I'm not quite sure what you mean with the rest of your comments.

The getSmallConstantTripCount() currently returns an exact trip count for a single-exit loop, while getSmallConstantMaxTripCount() returns a max trip count for any loop. As mentioned before, both provide different guarantees. I think to stick with the spirit of getSmallConstantTripCount(), it should only be returning a value if all the exit counts in a loop match, which doesn't seem terribly useful in practice.

@nikic We're talking past each other here.

Let me work up from definitions. I think this will make my point easier to follow.

An exact exit count for a loop exit is the iteration on which that loop exit *must* be taken. We know that no previous iteration can exit via that exit. A maximum exit count is any upper bound on an exact exit count. The only requirement is that it be a valid upper bound. In practice, we only consider constant upper bounds, but that's an implementation detail. See getExitCount(BB) and computeExitLimit for details. Note that we only compute exit counts for exits which dominate the latch, as we don't want to have to reason about exits being skipped on the potentially exiting iteration.

An exit count for a loop (either exact or max) is simply a umin of all the exit counts which dominate the latch. See getBackedgeTakenCount and computeBackedgeTakenCount.

A trip count is simply an exit (e.g. backedge taken) count plus one. The value 0 is used for "don't know". (We really should have used an option there instead, because the resulting code is confusing.)

A "small" exit count is simply one which evaluates to a small constant. A valid implementation of any of the "small constant trip count" methods is to compute the exact symbolic exit count, dyn_cast to a constant, and add one.
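The relationship between exit counts and trip counts described above can be sketched in a few lines. This is a minimal model, not SCEV's implementation: the function names are hypothetical, and the choice to report counts that do not fit in `unsigned` as unknown is an assumption of the sketch.

```cpp
#include <cstdint>
#include <limits>
#include <optional>

// Exit (backedge-taken) count -> trip count. Returns 0 for "don't know",
// mirroring the sentinel convention described in the comment above.
unsigned tripCountFromExitCount(std::optional<std::uint64_t> ExitCount) {
  if (!ExitCount)
    return 0;
  // Assumption of this sketch: a count whose +1 would not fit in `unsigned`
  // is not "small" and is reported as unknown.
  if (*ExitCount >= std::numeric_limits<unsigned>::max())
    return 0;
  return static_cast<unsigned>(*ExitCount) + 1;
}

// The same conversion with the "unknown" state kept in the type instead of
// being encoded as 0.
std::optional<unsigned> tripCountOpt(std::optional<std::uint64_t> ExitCount) {
  unsigned TC = tripCountFromExitCount(ExitCount);
  if (TC == 0)
    return std::nullopt;
  return TC;
}
```

An exit count of 0 (the backedge is never taken) yields a trip count of 1, which is why 0 is free to serve as the "don't know" value in the first variant.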

For historical reasons, the small constant trip count routines have never been updated to do that. Instead, they're still specialized for the case of a single exiting block. However, that simply means they produce "don't know" more than required.

My whole point has been - aside from the need to know which exit has the smallest trip count - that you could simply use the generic implementation which works for multiple exit loops. There's no need for SCEV to continue to use the form restricted to single exit loops.

For further clarity, here's the implementation I'm suggesting.
unsigned ScalarEvolution::getSmallConstantTripCount(const Loop *L) {
  auto *ExitCount = dyn_cast<SCEVConstant>(getBackedgeTakenCount(L, Exact));
  return getConstantTripCount(ExitCount);
}

(This passes make check btw.)

The rest of the patch would be removing the now-stale header comment, and reviewing callers to ensure that they can actually handle multiple exit loops to return a value other than "don't know" and if not, adding a guard before the call.

I just went ahead and finished the SCEV patch I was suggesting since I'd done most of it already.
See https://reviews.llvm.org/D103182

nikic added a comment. May 26 2021, 10:12 AM

This comment was removed by nikic.

@nikic Computing an exact exit count for a multiple exit loop using umin is correct, and is pretty much one of the key purposes of SCEV. I've tried to explain this multiple times, and we don't seem to be getting anywhere here. Can we *please* move this to a spoken conversation?

@reames Sorry, I completely confused myself here. Ignore everything I said in my last comment!

@reames Thanks for the patient explanation and me being so dense. No idea what I was thinking here.

To loop back around to the unrolling use-case, D103182 would now provide us with an exact trip count for the loop. However, I think we still need to use the approach from this patch, or something similar to it, for two reasons:

Without the knowledge which exit the trip count is for, we can fold all exits before the TripCount to "not taken", but we don't know which exit is the taken one on the last iteration.
For trip multiples, the "loop trip multiple" would be the minimum of all trip multiples (please correct me if I'm wrong on that!) which I believe is not what is desired for unrolling. If we have one unpredictable exit and one multiple-4 exit, we'd want to unroll against that exit and save us the intermediate branches, but the "loop trip multiple" would be 1 in that case, as we can exit from the loop on every iteration.

In D102982#2782674, @nikic wrote:

> @reames Thanks for the patient explanation and me being so dense. No idea what I was thinking here.
>
> To loop back around to the unrolling use-case, D103182 would now provide us with an exact trip count for the loop. However, I think we still need to use the approach from this patch, or something similar to it, for two reasons:
>
> Without the knowledge which exit the trip count is for, we can fold all exits before the TripCount to "not taken", but we don't know which exit is the taken one on the last iteration.

Maybe I'm now being dense, but *why* do we need to know which block was taken? I'm searching for uses of ExitingBlock in the code, and I can't find it used after the bit of code computing TripCount and TripMultiple.

> For trip multiples, the "loop trip multiple" would be the minimum of all trip multiples (please correct me if I'm wrong on that!) which I believe is not what is desired for unrolling. If we have one unpredictable exit and one multiple-4 exit, we'd want to unroll against that exit and save us the intermediate branches, but the "loop trip multiple" would be 1 in that case, as we can exit from the loop on every iteration.

So, not minimum of all trip multiples, GCD of the same. See D103189.

With GCD, I think unrolling gets a huge boost for multiple exit loops. We can still go further, but we'd need to actually expose multiple multiples to the cost model. Simply "lying" about the multiple to the costing code seems highly suspect. (Though, looking at the existing cost modeling code, it looks like it already fudges the multiple. I have no idea what that code is doing.)

I would request you stage this. Start with the easy GCD case which is not going to break any assumptions made later in the code (as it is a trip multiple for the loop), then come back to it if you have a motivating example and we can complicate the costing.
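To see why GCD rather than minimum is the right combination, note that whichever exit ends the loop, the trip count is a multiple of that exit's multiple, and therefore of the GCD of all of them. A minimal sketch follows; the function name `loopTripMultiple` is hypothetical and this is not the code from D103189.

```cpp
#include <numeric>
#include <vector>

// Combine per-exit trip multiples into a multiple that holds for the loop
// as a whole, regardless of which exit is eventually taken.
unsigned loopTripMultiple(const std::vector<unsigned> &ExitMultiples) {
  unsigned Result = 0; // std::gcd(0, M) == M, so 0 is a neutral start value
  for (unsigned M : ExitMultiples)
    Result = std::gcd(Result, M);
  // With no information, every trip count is still a multiple of 1.
  return Result == 0 ? 1 : Result;
}
```

For exits with multiples 4 and 6 this gives 2, whereas the minimum (4) would be wrong for a run that leaves through the second exit after 6 iterations. For one unpredictable exit (multiple 1) and one multiple-4 exit it gives 1, which matches the concern raised in the quoted comment.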

reames mentioned this in rG9306bb638ff2: [SCEV] Generalize getSmallConstantTripCount(L) for multiple exit loops. May 26 2021, 11:18 AM

I just realized what I was confused about in the first place: getBackedgeTakenCount() requires that all exits have an exact exit count. If you have one unpredictable exit and one exit with an exact exit count, then you'll get back an unpredictable backedge taken count. Only if all exits have an exact exit count will the umin be taken. (This is the isComplete() condition in BackedgeTakenInfo::getExact().)

So yes, the exact loop trip count is well-defined for multi-exit loops, but is not available if there is at least one unpredictable exit. However, a loop with one unpredictable exit and one exact exit is exactly the kind of loop I'm interested in here. Loop unrolling occurs against the exact exit, leaving us with TripCount unrolled checks of the unpredictable exit. (This is already supported if the exact trip count is on the latch, while other exits are unpredictable, but not if it's on a non-latch exit.)
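The completeness requirement described here can be modeled as follows. This is a simplified sketch, not SCEV's data structures: `ExitCount` and `exactLoopExitCount` are hypothetical names, and an empty optional stands in for an exit whose count cannot be computed.

```cpp
#include <algorithm>
#include <cstdint>
#include <limits>
#include <optional>
#include <vector>

using ExitCount = std::optional<std::uint64_t>; // nullopt: not computable

// Exact exit count of the loop: the unsigned minimum over all per-exit
// counts, available only if every exit has a known exact count.
ExitCount exactLoopExitCount(const std::vector<ExitCount> &Exits) {
  if (Exits.empty())
    return std::nullopt;
  std::uint64_t Min = std::numeric_limits<std::uint64_t>::max();
  for (const ExitCount &EC : Exits) {
    if (!EC)
      return std::nullopt; // one unpredictable exit makes the result unknown
    Min = std::min(Min, *EC);
  }
  return Min;
}
```

A loop with one unpredictable exit and one exit with exact count 2 therefore has no exact loop-level count in this model, even though the second exit alone is fully analyzable. That is the gap a per-exit query fills.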

In D102982#2782737, @reames wrote:

> In D102982#2782674, @nikic wrote:
>
>> @reames Thanks for the patient explanation and me being so dense. No idea what I was thinking here.
>>
>> To loop back around to the unrolling use-case, D103182 would now provide us with an exact trip count for the loop. However, I think we still need to use the approach from this patch, or something similar to it, for two reasons:
>>
>> Without the knowledge which exit the trip count is for, we can fold all exits before the TripCount to "not taken", but we don't know which exit is the taken one on the last iteration.
>
> Maybe I'm now being dense, but *why* do we need to know which block was taken? I'm searching for uses of ExitingBlock in the code, and I can't find it used after the bit of code computing TripCount and TripMultiple.

You're right: If we actually have an exact loop trip count, then we don't. But see my comment above: Here, we actually don't have an exact loop trip count. We have an exact trip count for one exit, while other exits might be unpredictable, and as such cannot be folded at all.

>> For trip multiples, the "loop trip multiple" would be the minimum of all trip multiples (please correct me if I'm wrong on that!) which I believe is not what is desired for unrolling. If we have one unpredictable exit and one multiple-4 exit, we'd want to unroll against that exit and save us the intermediate branches, but the "loop trip multiple" would be 1 in that case, as we can exit from the loop on every iteration.
>
> So, not minimum of all trip multiples, GCD of the same. See D103189.
>
> With GCD, I think unrolling gets a huge boost for multiple exit loops. We can still go further, but we'd need to actually expose multiple multiples to the cost model. Simply "lying" about the multiple to the costing code seems highly suspect. (Though, looking at the existing cost modeling code, it looks like it already fudges the multiple. I have no idea what that code is doing.)
>
> I would request you stage this. Start with the easy GCD case which is not going to break any assumptions made later in the code (as it is a trip multiple for the loop), then come back to it if you have a motivating example and we can complicate the costing.

I think I'm going to adjust this to not change the trip multiple handling at all in this patch, i.e. only use trip multiple for either a single exit or a latch exit, as before. We can deal with that case separately. I don't think we have good test coverage for this area right now.

In D102982#2782815, @nikic wrote:

> I just realized what I was confused about in the first place: getBackedgeTakenCount() requires that all exits have an exact exit count. If you have one unpredictable exit and one exit with an exact exit count, then you'll get back an unpredictable backedge taken count. Only if all exits have an exact exit count will the umin be taken. (This is the isComplete() condition in BackedgeTakenInfo::getExact().)

I think you've returned to an earlier confusion. I don't remember if this was from this review, or another recent one.

What unrolling appears to want here is a *maximum* trip count. What it's actually computing is an *exact* trip count. The variable names are highly confusing. I had previously suggested you rename all the TripCount variables to MaxTripCount and use the appropriate means to query that upper bound.

Specifically:
SE.getTripCountFromExitCount(SE.getConstantMaxBackedgeTakenCount(L))

I repeat that suggestion.

Could you please mark it as "changes planned" until the underlying patch is ready or merged?

nikic planned changes to this revision. Jun 2 2021, 12:17 AM

Rebase, and only consider TripCount for all exits, leave TripMultiple alone for now to minimize impact.
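The selection logic after this update can be modeled roughly as below. This is a standalone sketch, not the pass's code: `TripInfo`, `pickTripInfo`, and the `FallbackMultiple` parameter are hypothetical, per-exit trip counts use 0 for "unknown", and the fallback stands in for the trip multiple queried from the latch or the single exiting block.

```cpp
#include <vector>

struct TripInfo {
  unsigned TripCount;    // 0 means no exact trip count is known
  unsigned TripMultiple; // always at least 1
};

// Pick the smallest known exact trip count over all exits. If none is known,
// leave the trip count at 0 and use the separately determined multiple.
TripInfo pickTripInfo(const std::vector<unsigned> &PerExitTripCounts,
                      unsigned FallbackMultiple) {
  unsigned TripCount = 0;
  for (unsigned TC : PerExitTripCounts)
    if (TC != 0 && (TripCount == 0 || TC < TripCount))
      TripCount = TC;
  if (TripCount != 0)
    return {TripCount, TripCount}; // an exact count is also a valid multiple
  return {0, FallbackMultiple};
}
```

With per-exit counts {0, 5, 3} the result is a trip count and multiple of 3; with {0, 0} only the fallback multiple is reported.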

Harbormaster completed remote builds in B110058: Diff 353181. Jun 19 2021, 1:47 AM

Would be LGTM, but I'd like to understand the called out test change first. Everything else looks as expected.

llvm/lib/Transforms/Scalar/LoopUnrollPass.cpp
1124	Your comment here is wrong. The code is correct, but the comment isn't. :) If we unroll by the exact trip count of any exit, we're guaranteed to break the backedge. As such, there might be conditional exits left in earlier iterations, but there will be nothing in later iterations which is what your comment appears to say. It might also be worth stating explicitly that this is an upper bound on the actual trip count of the loop (since an earlier conditional exit we can't analyze might be taken), and draw the distinction with a maximum count (conservatism in analyzing each exit.) Separately, I really think we should be allowing max trip counts here, but that's a separate step.
llvm/test/Transforms/LoopUnroll/runtime-loop-known-exit.ll
9	This test change looks concerning. Have you explored why this is happening? On the surface, this looks like a bad interaction between runtime unrolling and finding a more precise trip count for full unrolling we may need to explore.

nikic added inline comments. Jun 19 2021, 11:46 AM

llvm/lib/Transforms/Scalar/LoopUnrollPass.cpp
1124	What I really wanted to say here is that an unroll by this trip count eliminates all branches relating to one exit, but branches relating to other exits may have to be kept. This is opposed to the max trip count case where we're only guaranteed to break the backedge, but may not be able to remove any other branches. The unroll code already handles max trip count, but it's only used if no exact trip count is known, and is controlled by a target option (which is disabled on X86...)
llvm/test/Transforms/LoopUnroll/runtime-loop-known-exit.ll
9	When a trip count is known, we don't perform runtime unrolling and perform partial unrolling instead. In this case it doesn't happen because I specified `-unroll-runtime` but not `-unroll-allow-partial`, so we get no unrolling. If we do allow partial unrolling, the result looks like this: https://gist.github.com/nikic/0541f7937f6db4867ef7fd4d7673a2b1 We don't perform the runtime unroll transform to enforce a certain trip multiple on the latch exit, and instead make use of the known trip count on the other exit to only check it once per iteration. I'd say the new result is better (same reduction in branches without the need for a remainder loop), with the caveat that it's more aggressive, because it doesn't use the default runtime unroll count.

nikic added inline comments. Jun 19 2021, 12:29 PM

llvm/lib/Transforms/Scalar/LoopUnrollPass.cpp
1124	I guess it's worth discussing the larger context here. It probably doesn't come as a surprise that the modelling is rather odd and doesn't seem particularly principled. https://github.com/llvm/llvm-project/blob/59d90fe817b5f1feae1a1406bd487e6552b9928d/llvm/lib/Transforms/Scalar/LoopUnrollPass.cpp#L835-L846 lays out the reason why this is behind a target option, which is that it will result in more branches, which may be problematic for constrained branch predictors. What this doesn't take into account is that even a full unroll may replicate branches, either for other exits or just control flow within the loop. What would make more sense to me is to have some kind of "branch penalty" that applies for each newly introduced branch -- this could be due to inner control flow, a remaining unpredictable exit, or an exit that only has an upper bound.
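The exact-count guarantee discussed in this thread can be pictured per unrolled copy. The sketch below is a deliberately simplified model under stated assumptions: `Fold` and `foldExit` are hypothetical names, the loop is fully unrolled by the exact trip count of one exit, and "Kept" only means that trip-count knowledge alone does not fold the branch.

```cpp
#include <vector>

enum class Fold { NotTaken, Taken, Kept };

// State of one exit's branch in each of the UnrollCount copies of the body.
// IsUnrolledAgainst is true for the exit whose exact trip count equals
// UnrollCount; any other exit is treated as unpredictable here.
std::vector<Fold> foldExit(unsigned UnrollCount, bool IsUnrolledAgainst) {
  std::vector<Fold> States(UnrollCount, Fold::Kept);
  if (!IsUnrolledAgainst || UnrollCount == 0)
    return States;
  // The exit cannot be taken before its exact trip count is reached ...
  for (unsigned I = 0; I + 1 < UnrollCount; ++I)
    States[I] = Fold::NotTaken;
  // ... and must be taken in the final copy.
  States[UnrollCount - 1] = Fold::Taken;
  return States;
}
```

For an unroll count of 3, the exit being unrolled against folds to not-taken, not-taken, taken, while every copy of an unpredictable exit stays conditional. A max trip count would not justify the not-taken folds, since an upper bound says nothing about earlier iterations.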

Response explained the test case interaction. LGTM w/a tweaked comment.

This revision is now accepted and ready to land. Jun 19 2021, 5:08 PM

Update comment.

Harbormaster completed remote builds in B110095: Diff 353233. Jun 20 2021, 7:50 AM

Comment much improved, thanks!

This revision was landed with ongoing or failed builds. Jun 20 2021, 12:01 PM

Closed by commit rG1ae266f4529f: [LoopUnroll] Use smallest exact trip count from any exit (authored by nikic).

This revision was automatically updated to reflect the committed changes.

nikic added a commit: rG1ae266f4529f: [LoopUnroll] Use smallest exact trip count from any exit.

Revision Contents

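Path                                                                                               Size
llvm/lib/Transforms/Scalar/LoopUnrollPass.cpp                                                      31 lines
llvm/test/Transforms/LoopUnroll/full-unroll-one-unpredictable-exit.ll                              26 lines
llvm/test/Transforms/LoopUnroll/multiple-exits.ll                                                  86 lines
llvm/test/Transforms/LoopUnroll/partial-unroll-non-latch-exit.ll                                   52 lines
llvm/test/Transforms/LoopUnroll/runtime-loop-known-exit.ll                                         71 lines
llvm/test/Transforms/LoopUnroll/scevunroll.ll                                                      130 lines
llvm/test/Transforms/LoopUnroll/unloop.ll                                                          2 lines
llvm/test/Transforms/LoopUnroll/unroll-header-exiting-with-phis-multiple-exiting-blocks.ll         52 lines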

Diff 353237

llvm/lib/Transforms/Scalar/LoopUnrollPass.cpp

@@ ... @@ static LoopUnrollResult tryToUnrollLoop(
   if (OptForSize)
     UP.Threshold = std::max(UP.Threshold, LoopSize + 1);
 
   if (NumInlineCandidates != 0) {
     LLVM_DEBUG(dbgs() << " Not unrolling loop with inlinable calls.\n");
     return LoopUnrollResult::Unmodified;
   }
 
-  // Find trip count and trip multiple if count is not available
+  // Find the smallest exact trip count for any exit. This is an upper bound
+  // on the loop trip count, but an exit at an earlier iteration is still
+  // possible. An unroll by the smallest exact trip count guarantees that all
+  // branches relating to at least one exit can be eliminated. This is unlike
+  // the max trip count, which only guarantees that the backedge can be broken.
   unsigned TripCount = 0;
   unsigned TripMultiple = 1;
-  // If there are multiple exiting blocks but one of them is the latch, use the
-  // latch for the trip count estimation. Otherwise insist on a single exiting
-  // block for the trip count estimation.
-  BasicBlock *ExitingBlock = L->getLoopLatch();
-  if (!ExitingBlock || !L->isLoopExiting(ExitingBlock))
-    ExitingBlock = L->getExitingBlock();
-  if (ExitingBlock) {
-    TripCount = SE.getSmallConstantTripCount(L, ExitingBlock);
-    TripMultiple = SE.getSmallConstantTripMultiple(L, ExitingBlock);
-  }
+  SmallVector<BasicBlock *, 8> ExitingBlocks;
+  L->getExitingBlocks(ExitingBlocks);
+  for (BasicBlock *ExitingBlock : ExitingBlocks)
+    if (unsigned TC = SE.getSmallConstantTripCount(L, ExitingBlock))
+      if (!TripCount || TC < TripCount)
+        TripCount = TripMultiple = TC;
+
+  if (!TripCount) {
+    // If no exact trip count is known, determine the trip multiple of either
+    // the loop latch or the single exiting block.
+    // TODO: Relax for multiple exits.
+    BasicBlock *ExitingBlock = L->getLoopLatch();
+    if (!ExitingBlock || !L->isLoopExiting(ExitingBlock))
+      ExitingBlock = L->getExitingBlock();
+    if (ExitingBlock)
+      TripMultiple = SE.getSmallConstantTripMultiple(L, ExitingBlock);
+  }
 
   // If the loop contains a convergent operation, the prelude we'd add
   // to do the first few instructions before we hit the unrolled loop
   // is unsafe -- it adds a control-flow dependency to the convergent
   // operation. Therefore restrict remainder loop (try unrolling without).
   //
   // TODO: This is quite conservative. In practice, convergent_op()

llvm/test/Transforms/LoopUnroll/full-unroll-one-unpredictable-exit.ll

@@ ... @@
 ; CHECK-NEXT: [[A1_1:%.*]] = getelementptr inbounds [2 x i64], [2 x i64]* [[A1]], i64 0, i64 1
 ; CHECK-NEXT: store i64 -8661621401413125213, i64* [[A1_1]], align 8
 ; CHECK-NEXT: [[A2_0:%.*]] = getelementptr inbounds [2 x i64], [2 x i64]* [[A2]], i64 0, i64 0
 ; CHECK-NEXT: store i64 -5015437470765251660, i64* [[A2_0]], align 8
 ; CHECK-NEXT: [[A2_1:%.*]] = getelementptr inbounds [2 x i64], [2 x i64]* [[A2]], i64 0, i64 1
 ; CHECK-NEXT: store i64 -8661621401413125213, i64* [[A2_1]], align 8
 ; CHECK-NEXT: br label [[LOOP:%.*]]
 ; CHECK: loop:
-; CHECK-NEXT: [[IV:%.*]] = phi i64 [ 0, [[START:%.*]] ], [ [[IV_NEXT:%.*]], [[LATCH:%.*]] ]
-; CHECK-NEXT: [[EXITCOND:%.*]] = icmp eq i64 [[IV]], 2
-; CHECK-NEXT: br i1 [[EXITCOND]], label [[EXIT:%.*]], label [[LATCH]]
+; CHECK-NEXT: br label [[LATCH:%.*]]
 ; CHECK: latch:
-; CHECK-NEXT: [[IV_NEXT]] = add nuw nsw i64 [[IV]], 1
-; CHECK-NEXT: [[GEP1:%.*]] = getelementptr inbounds [2 x i64], [2 x i64]* [[A1]], i64 0, i64 [[IV]]
-; CHECK-NEXT: [[GEP2:%.*]] = getelementptr inbounds [2 x i64], [2 x i64]* [[A2]], i64 0, i64 [[IV]]
+; CHECK-NEXT: [[GEP1:%.*]] = getelementptr inbounds [2 x i64], [2 x i64]* [[A1]], i64 0, i64 0
+; CHECK-NEXT: [[GEP2:%.*]] = getelementptr inbounds [2 x i64], [2 x i64]* [[A2]], i64 0, i64 0
 ; CHECK-NEXT: [[LOAD1:%.*]] = load i64, i64* [[GEP1]], align 8
 ; CHECK-NEXT: [[LOAD2:%.*]] = load i64, i64* [[GEP2]], align 8
 ; CHECK-NEXT: [[EXITCOND2:%.*]] = icmp eq i64 [[LOAD1]], [[LOAD2]]
-; CHECK-NEXT: br i1 [[EXITCOND2]], label [[LOOP]], label [[EXIT]]
+; CHECK-NEXT: br i1 [[EXITCOND2]], label [[LOOP_1:%.*]], label [[EXIT:%.*]]
 ; CHECK: exit:
-; CHECK-NEXT: [[EXIT_VAL:%.*]] = phi i1 [ false, [[LATCH]] ], [ true, [[LOOP]] ]
+; CHECK-NEXT: [[EXIT_VAL:%.*]] = phi i1 [ false, [[LATCH]] ], [ false, [[LATCH_1:%.*]] ], [ true, [[LOOP_2:%.*]] ], [ false, [[LATCH_2:%.*]] ]
 ; CHECK-NEXT: ret i1 [[EXIT_VAL]]
+; CHECK: loop.1:
+; CHECK-NEXT: br label [[LATCH_1]]
+; CHECK: latch.1:
+; CHECK-NEXT: [[GEP1_1:%.*]] = getelementptr inbounds [2 x i64], [2 x i64]* [[A1]], i64 0, i64 1
+; CHECK-NEXT: [[GEP2_1:%.*]] = getelementptr inbounds [2 x i64], [2 x i64]* [[A2]], i64 0, i64 1
+; CHECK-NEXT: [[LOAD1_1:%.*]] = load i64, i64* [[GEP1_1]], align 8
+; CHECK-NEXT: [[LOAD2_1:%.*]] = load i64, i64* [[GEP2_1]], align 8
+; CHECK-NEXT: [[EXITCOND2_1:%.*]] = icmp eq i64 [[LOAD1_1]], [[LOAD2_1]]
+; CHECK-NEXT: br i1 [[EXITCOND2_1]], label [[LOOP_2]], label [[EXIT]]
+; CHECK: loop.2:
+; CHECK-NEXT: br i1 true, label [[EXIT]], label [[LATCH_2]]
+; CHECK: latch.2:
+; CHECK-NEXT: br label [[EXIT]]
 ;
 start:
 %a1 = alloca [2 x i64], align 8
 %a2 = alloca [2 x i64], align 8
	%a1.0 = getelementptr inbounds [2 x i64], [2 x i64]* %a1, i64 0, i64 0			%a1.0 = getelementptr inbounds [2 x i64], [2 x i64]* %a1, i64 0, i64 0
	store i64 -5015437470765251660, i64* %a1.0, align 8			store i64 -5015437470765251660, i64* %a1.0, align 8
	%a1.1 = getelementptr inbounds [2 x i64], [2 x i64]* %a1, i64 0, i64 1			%a1.1 = getelementptr inbounds [2 x i64], [2 x i64]* %a1, i64 0, i64 1
	store i64 -8661621401413125213, i64* %a1.1, align 8			store i64 -8661621401413125213, i64* %a1.1, align 8
	Show All 24 Lines

llvm/test/Transforms/LoopUnroll/multiple-exits.ll

	Show First 20 Lines • Show All 66 Lines • ▼ Show 20 Lines
	latch:			latch:
	call void @bar()			call void @bar()
	%cmp2 = icmp ult i64 %iv, 20			%cmp2 = icmp ult i64 %iv, 20
	br i1 %cmp2, label %loop, label %exit			br i1 %cmp2, label %loop, label %exit
	exit:			exit:
	ret void			ret void
	}			}

	; TODO: We should fully unroll this by 10, leave the unrolled latch			; Fully unroll this loop by 10, but leave the unrolled latch
	; tests since we don't know if %N < 10, and break the backedge.			; tests since we don't know if %N < 10, and break the backedge.
	define void @test2(i64 %N) {			define void @test2(i64 %N) {
	; CHECK-LABEL: @test2(			; CHECK-LABEL: @test2(
	; CHECK-NEXT: entry:			; CHECK-NEXT: entry:
	; CHECK-NEXT: br label [[LOOP:%.*]]			; CHECK-NEXT: br label [[LOOP:%.*]]
	; CHECK: loop:			; CHECK: loop:
	; CHECK-NEXT: [[IV:%.]] = phi i64 [ 0, [[ENTRY:%.]] ], [ [[IV_NEXT:%.]], [[LATCH:%.]] ]
	; CHECK-NEXT: [[IV_NEXT]] = add i64 [[IV]], 1
	; CHECK-NEXT: call void @bar()			; CHECK-NEXT: call void @bar()
	; CHECK-NEXT: [[CMP1:%.*]] = icmp ule i64 [[IV]], 10			; CHECK-NEXT: br label [[LATCH:%.*]]
	; CHECK-NEXT: br i1 [[CMP1]], label [[LATCH]], label [[EXIT:%.*]]
	; CHECK: latch:			; CHECK: latch:
	; CHECK-NEXT: call void @bar()			; CHECK-NEXT: call void @bar()
	; CHECK-NEXT: [[CMP2:%.]] = icmp ule i64 [[IV]], [[N:%.]]			; CHECK-NEXT: br i1 true, label [[LOOP_1:%.]], label [[EXIT:%.]]
	; CHECK-NEXT: br i1 [[CMP2]], label [[LOOP]], label [[EXIT]]
	; CHECK: exit:			; CHECK: exit:
	; CHECK-NEXT: ret void			; CHECK-NEXT: ret void
				; CHECK: loop.1:
				; CHECK-NEXT: call void @bar()
				; CHECK-NEXT: br label [[LATCH_1:%.*]]
				; CHECK: latch.1:
				; CHECK-NEXT: call void @bar()
				; CHECK-NEXT: [[CMP2_1:%.]] = icmp ule i64 1, [[N:%.]]
				; CHECK-NEXT: br i1 [[CMP2_1]], label [[LOOP_2:%.*]], label [[EXIT]]
				; CHECK: loop.2:
				; CHECK-NEXT: call void @bar()
				; CHECK-NEXT: br label [[LATCH_2:%.*]]
				; CHECK: latch.2:
				; CHECK-NEXT: call void @bar()
				; CHECK-NEXT: [[CMP2_2:%.*]] = icmp ule i64 2, [[N]]
				; CHECK-NEXT: br i1 [[CMP2_2]], label [[LOOP_3:%.*]], label [[EXIT]]
				; CHECK: loop.3:
				; CHECK-NEXT: call void @bar()
				; CHECK-NEXT: br label [[LATCH_3:%.*]]
				; CHECK: latch.3:
				; CHECK-NEXT: call void @bar()
				; CHECK-NEXT: [[CMP2_3:%.*]] = icmp ule i64 3, [[N]]
				; CHECK-NEXT: br i1 [[CMP2_3]], label [[LOOP_4:%.*]], label [[EXIT]]
				; CHECK: loop.4:
				; CHECK-NEXT: call void @bar()
				; CHECK-NEXT: br label [[LATCH_4:%.*]]
				; CHECK: latch.4:
				; CHECK-NEXT: call void @bar()
				; CHECK-NEXT: [[CMP2_4:%.*]] = icmp ule i64 4, [[N]]
				; CHECK-NEXT: br i1 [[CMP2_4]], label [[LOOP_5:%.*]], label [[EXIT]]
				; CHECK: loop.5:
				; CHECK-NEXT: call void @bar()
				; CHECK-NEXT: br label [[LATCH_5:%.*]]
				; CHECK: latch.5:
				; CHECK-NEXT: call void @bar()
				; CHECK-NEXT: [[CMP2_5:%.*]] = icmp ule i64 5, [[N]]
				; CHECK-NEXT: br i1 [[CMP2_5]], label [[LOOP_6:%.*]], label [[EXIT]]
				; CHECK: loop.6:
				; CHECK-NEXT: call void @bar()
				; CHECK-NEXT: br label [[LATCH_6:%.*]]
				; CHECK: latch.6:
				; CHECK-NEXT: call void @bar()
				; CHECK-NEXT: [[CMP2_6:%.*]] = icmp ule i64 6, [[N]]
				; CHECK-NEXT: br i1 [[CMP2_6]], label [[LOOP_7:%.*]], label [[EXIT]]
				; CHECK: loop.7:
				; CHECK-NEXT: call void @bar()
				; CHECK-NEXT: br label [[LATCH_7:%.*]]
				; CHECK: latch.7:
				; CHECK-NEXT: call void @bar()
				; CHECK-NEXT: [[CMP2_7:%.*]] = icmp ule i64 7, [[N]]
				; CHECK-NEXT: br i1 [[CMP2_7]], label [[LOOP_8:%.*]], label [[EXIT]]
				; CHECK: loop.8:
				; CHECK-NEXT: call void @bar()
				; CHECK-NEXT: br label [[LATCH_8:%.*]]
				; CHECK: latch.8:
				; CHECK-NEXT: call void @bar()
				; CHECK-NEXT: [[CMP2_8:%.*]] = icmp ule i64 8, [[N]]
				; CHECK-NEXT: br i1 [[CMP2_8]], label [[LOOP_9:%.*]], label [[EXIT]]
				; CHECK: loop.9:
				; CHECK-NEXT: call void @bar()
				; CHECK-NEXT: br label [[LATCH_9:%.*]]
				; CHECK: latch.9:
				; CHECK-NEXT: call void @bar()
				; CHECK-NEXT: [[CMP2_9:%.*]] = icmp ule i64 9, [[N]]
				; CHECK-NEXT: br i1 [[CMP2_9]], label [[LOOP_10:%.*]], label [[EXIT]]
				; CHECK: loop.10:
				; CHECK-NEXT: call void @bar()
				; CHECK-NEXT: br label [[LATCH_10:%.*]]
				; CHECK: latch.10:
				; CHECK-NEXT: call void @bar()
				; CHECK-NEXT: [[CMP2_10:%.*]] = icmp ule i64 10, [[N]]
				; CHECK-NEXT: br i1 [[CMP2_10]], label [[LOOP_11:%.*]], label [[EXIT]]
				; CHECK: loop.11:
				; CHECK-NEXT: call void @bar()
				; CHECK-NEXT: br i1 false, label [[LATCH_11:%.*]], label [[EXIT]]
				; CHECK: latch.11:
				; CHECK-NEXT: call void @bar()
				; CHECK-NEXT: br label [[EXIT]]
	;			;
	entry:			entry:
	br label %loop			br label %loop
	loop:			loop:
	%iv = phi i64 [0, %entry], [%iv.next, %latch]			%iv = phi i64 [0, %entry], [%iv.next, %latch]
	%iv.next = add i64 %iv, 1			%iv.next = add i64 %iv, 1
	call void @bar()			call void @bar()
	%cmp1 = icmp ule i64 %iv, 10			%cmp1 = icmp ule i64 %iv, 10
	▲ Show 20 Lines • Show All 47 Lines • Show Last 20 Lines
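
Editorial sketch of the branch pattern in the new `@test2` expectations: the exit with a known trip count is resolved in every unrolled copy, while the latch test against `%N` is kept. The names `Fold`, `foldKnownExit` and `foldUnknownExit` are hypothetical and are not part of LoopUnrollPass.cpp; this is a simplified model under those assumptions, and later simplification can still fold individual tests that happen to be trivially true or false.

```cpp
#include <cassert>
#include <vector>

// Hypothetical model of what an exiting branch looks like in one unrolled copy.
enum class Fold { KeepConditional, AlwaysContinue, AlwaysExit };

// For an exit whose exact trip count is TripCount, the branch stays inside
// the loop body in every copy except the last one, where it must exit.
std::vector<Fold> foldKnownExit(unsigned TripCount) {
  assert(TripCount > 0 && "a trip count of zero means 'unknown' here");
  std::vector<Fold> Result(TripCount, Fold::AlwaysContinue);
  Result.back() = Fold::AlwaysExit;
  return Result;
}

// An exit without a known trip count keeps its runtime test in every copy.
std::vector<Fold> foldUnknownExit(unsigned Copies) {
  return std::vector<Fold>(Copies, Fold::KeepConditional);
}
```

With a trip count of 12 for the header exit (copies `loop` through `loop.11`), `foldKnownExit(12)` marks the first eleven copies as always continuing and the last as always exiting, while `foldUnknownExit(12)` leaves every latch test in place.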

llvm/test/Transforms/LoopUnroll/partial-unroll-non-latch-exit.ll

	; NOTE: Assertions have been autogenerated by utils/update_test_checks.py			; NOTE: Assertions have been autogenerated by utils/update_test_checks.py
	; RUN: opt -S -loop-unroll -unroll-allow-partial %s \| FileCheck %s			; RUN: opt -S -loop-unroll -unroll-allow-partial %s \| FileCheck %s

	; This is a variant on full-unroll-non-latch-exit.ll for partial unrolling.			; This is a variant on full-unroll-non-latch-exit.ll for partial unrolling.
	; This test is primarily interested in making sure that latches are not			; This test is primarily interested in making sure that latches are not
	; folded incorrectly, not that a transform occurs.			; folded incorrectly, not that a transform occurs.

	define i1 @test(i64* %a1, i64* %a2) {			define i1 @test(i64* %a1, i64* %a2) {
	; CHECK-LABEL: @test(			; CHECK-LABEL: @test(
	; CHECK-NEXT: start:			; CHECK-NEXT: start:
	; CHECK-NEXT: br label [[LOOP:%.*]]			; CHECK-NEXT: br label [[LOOP:%.*]]
	; CHECK: loop:			; CHECK: loop:
	; CHECK-NEXT: [[IV:%.]] = phi i64 [ 0, [[START:%.]] ], [ [[IV_NEXT:%.]], [[LATCH:%.]] ]			; CHECK-NEXT: [[IV:%.]] = phi i64 [ 0, [[START:%.]] ], [ [[IV_NEXT_4:%.]], [[LATCH_4:%.]] ]
	; CHECK-NEXT: [[EXITCOND:%.*]] = icmp eq i64 [[IV]], 24			; CHECK-NEXT: br label [[LATCH:%.*]]
	; CHECK-NEXT: br i1 [[EXITCOND]], label [[EXIT:%.*]], label [[LATCH]]
	; CHECK: latch:			; CHECK: latch:
	; CHECK-NEXT: [[IV_NEXT]] = add nuw nsw i64 [[IV]], 1			; CHECK-NEXT: [[IV_NEXT:%.*]] = add nuw nsw i64 [[IV]], 1
	; CHECK-NEXT: [[GEP1:%.]] = getelementptr inbounds i64, i64 [[A1:%.*]], i64 [[IV]]			; CHECK-NEXT: [[GEP1:%.]] = getelementptr inbounds i64, i64 [[A1:%.*]], i64 [[IV]]
	; CHECK-NEXT: [[GEP2:%.]] = getelementptr inbounds i64, i64 [[A2:%.*]], i64 [[IV]]			; CHECK-NEXT: [[GEP2:%.]] = getelementptr inbounds i64, i64 [[A2:%.*]], i64 [[IV]]
	; CHECK-NEXT: [[LOAD1:%.]] = load i64, i64 [[GEP1]], align 8			; CHECK-NEXT: [[LOAD1:%.]] = load i64, i64 [[GEP1]], align 8
	; CHECK-NEXT: [[LOAD2:%.]] = load i64, i64 [[GEP2]], align 8			; CHECK-NEXT: [[LOAD2:%.]] = load i64, i64 [[GEP2]], align 8
	; CHECK-NEXT: [[EXITCOND2:%.*]] = icmp eq i64 [[LOAD1]], [[LOAD2]]			; CHECK-NEXT: [[EXITCOND2:%.*]] = icmp eq i64 [[LOAD1]], [[LOAD2]]
	; CHECK-NEXT: br i1 [[EXITCOND2]], label [[LOOP]], label [[EXIT]]			; CHECK-NEXT: br i1 [[EXITCOND2]], label [[LOOP_1:%.]], label [[EXIT:%.]]
	; CHECK: exit:			; CHECK: exit:
	; CHECK-NEXT: [[EXIT_VAL:%.*]] = phi i1 [ false, [[LATCH]] ], [ true, [[LOOP]] ]			; CHECK-NEXT: [[EXIT_VAL:%.]] = phi i1 [ false, [[LATCH]] ], [ false, [[LATCH_1:%.]] ], [ false, [[LATCH_2:%.]] ], [ false, [[LATCH_3:%.]] ], [ true, [[LOOP_4:%.*]] ], [ false, [[LATCH_4]] ]
	; CHECK-NEXT: ret i1 [[EXIT_VAL]]			; CHECK-NEXT: ret i1 [[EXIT_VAL]]
				; CHECK: loop.1:
				; CHECK-NEXT: br label [[LATCH_1]]
				; CHECK: latch.1:
				; CHECK-NEXT: [[IV_NEXT_1:%.*]] = add nuw nsw i64 [[IV_NEXT]], 1
				; CHECK-NEXT: [[GEP1_1:%.]] = getelementptr inbounds i64, i64 [[A1]], i64 [[IV_NEXT]]
				; CHECK-NEXT: [[GEP2_1:%.]] = getelementptr inbounds i64, i64 [[A2]], i64 [[IV_NEXT]]
				; CHECK-NEXT: [[LOAD1_1:%.]] = load i64, i64 [[GEP1_1]], align 8
				; CHECK-NEXT: [[LOAD2_1:%.]] = load i64, i64 [[GEP2_1]], align 8
				; CHECK-NEXT: [[EXITCOND2_1:%.*]] = icmp eq i64 [[LOAD1_1]], [[LOAD2_1]]
				; CHECK-NEXT: br i1 [[EXITCOND2_1]], label [[LOOP_2:%.*]], label [[EXIT]]
				; CHECK: loop.2:
				; CHECK-NEXT: br label [[LATCH_2]]
				; CHECK: latch.2:
				; CHECK-NEXT: [[IV_NEXT_2:%.*]] = add nuw nsw i64 [[IV_NEXT_1]], 1
				; CHECK-NEXT: [[GEP1_2:%.]] = getelementptr inbounds i64, i64 [[A1]], i64 [[IV_NEXT_1]]
				; CHECK-NEXT: [[GEP2_2:%.]] = getelementptr inbounds i64, i64 [[A2]], i64 [[IV_NEXT_1]]
				; CHECK-NEXT: [[LOAD1_2:%.]] = load i64, i64 [[GEP1_2]], align 8
				; CHECK-NEXT: [[LOAD2_2:%.]] = load i64, i64 [[GEP2_2]], align 8
				; CHECK-NEXT: [[EXITCOND2_2:%.*]] = icmp eq i64 [[LOAD1_2]], [[LOAD2_2]]
				; CHECK-NEXT: br i1 [[EXITCOND2_2]], label [[LOOP_3:%.*]], label [[EXIT]]
				; CHECK: loop.3:
				; CHECK-NEXT: br label [[LATCH_3]]
				; CHECK: latch.3:
				; CHECK-NEXT: [[IV_NEXT_3:%.*]] = add nuw nsw i64 [[IV_NEXT_2]], 1
				; CHECK-NEXT: [[GEP1_3:%.]] = getelementptr inbounds i64, i64 [[A1]], i64 [[IV_NEXT_2]]
				; CHECK-NEXT: [[GEP2_3:%.]] = getelementptr inbounds i64, i64 [[A2]], i64 [[IV_NEXT_2]]
				; CHECK-NEXT: [[LOAD1_3:%.]] = load i64, i64 [[GEP1_3]], align 8
				; CHECK-NEXT: [[LOAD2_3:%.]] = load i64, i64 [[GEP2_3]], align 8
				; CHECK-NEXT: [[EXITCOND2_3:%.*]] = icmp eq i64 [[LOAD1_3]], [[LOAD2_3]]
				; CHECK-NEXT: br i1 [[EXITCOND2_3]], label [[LOOP_4]], label [[EXIT]]
				; CHECK: loop.4:
				; CHECK-NEXT: [[EXITCOND_4:%.*]] = icmp eq i64 [[IV_NEXT_3]], 24
				; CHECK-NEXT: br i1 [[EXITCOND_4]], label [[EXIT]], label [[LATCH_4]]
				; CHECK: latch.4:
				; CHECK-NEXT: [[IV_NEXT_4]] = add nuw nsw i64 [[IV_NEXT_3]], 1
				; CHECK-NEXT: [[GEP1_4:%.]] = getelementptr inbounds i64, i64 [[A1]], i64 [[IV_NEXT_3]]
				; CHECK-NEXT: [[GEP2_4:%.]] = getelementptr inbounds i64, i64 [[A2]], i64 [[IV_NEXT_3]]
				; CHECK-NEXT: [[LOAD1_4:%.]] = load i64, i64 [[GEP1_4]], align 8
				; CHECK-NEXT: [[LOAD2_4:%.]] = load i64, i64 [[GEP2_4]], align 8
				; CHECK-NEXT: [[EXITCOND2_4:%.*]] = icmp eq i64 [[LOAD1_4]], [[LOAD2_4]]
				; CHECK-NEXT: br i1 [[EXITCOND2_4]], label [[LOOP]], label [[EXIT]]
	;			;
	start:			start:
	br label %loop			br label %loop

	loop:			loop:
	%iv = phi i64 [ 0, %start ], [ %iv.next, %latch ]			%iv = phi i64 [ 0, %start ], [ %iv.next, %latch ]
	%exitcond = icmp eq i64 %iv, 24			%exitcond = icmp eq i64 %iv, 24
	br i1 %exitcond, label %exit, label %latch			br i1 %exitcond, label %exit, label %latch
	Show All 14 Lines

llvm/test/Transforms/LoopUnroll/runtime-loop-known-exit.ll

	; NOTE: Assertions have been autogenerated by utils/update_test_checks.py			; NOTE: Assertions have been autogenerated by utils/update_test_checks.py
	; RUN: opt -S -loop-unroll -unroll-runtime -unroll-runtime-multi-exit < %s \| FileCheck %s			; RUN: opt -S -loop-unroll -unroll-runtime -unroll-runtime-multi-exit < %s \| FileCheck %s

	; This loop has a known trip count on the non-latch exit. When performing			; This loop has a known trip count on the non-latch exit. When performing
	; runtime unrolling (at least when using a prologue rather than epilogue) we			; runtime unrolling (at least when using a prologue rather than epilogue) we
	; should not fold that exit based on known trip count information prior to			; should not fold that exit based on known trip count information prior to
	; prologue insertion, as that may change the trip count for the modified loop.			; prologue insertion, as that may change the trip count for the modified loop.

	define void @test(i32 %s, i32 %n) {			define void @test(i32 %s, i32 %n) {
	reames (inline comment): This test change looks concerning. Have you explored why this is happening? On the surface, this looks like a bad interaction between runtime unrolling and finding a more precise trip count for full unrolling, which we may need to explore.
	nikic (author, inline comment): When a trip count is known, we don't perform runtime unrolling and perform partial unrolling instead. In this case that doesn't happen because I specified `-unroll-runtime` but not `-unroll-allow-partial`, so we get no unrolling. If we do allow partial unrolling, the result looks like this: https://gist.github.com/nikic/0541f7937f6db4867ef7fd4d7673a2b1 We don't perform the runtime unroll transform to enforce a certain trip multiple on the latch exit, and instead make use of the known trip count on the other exit to only check it once per iteration. I'd say the new result is better (same reduction in branches without the need for a remainder loop), with the caveat that it's more aggressive, because it doesn't use the default runtime unroll count.
	; CHECK-LABEL: @test(			; CHECK-LABEL: @test(
	; CHECK-NEXT: entry:			; CHECK-NEXT: entry:
	; CHECK-NEXT: [[N2:%.]] = add i32 [[S:%.]], 123			; CHECK-NEXT: [[N2:%.]] = add i32 [[S:%.]], 123
	; CHECK-NEXT: [[TMP0:%.]] = add i32 [[N:%.]], 1
	; CHECK-NEXT: [[TMP1:%.*]] = sub i32 [[TMP0]], [[S]]
	; CHECK-NEXT: [[TMP2:%.*]] = sub i32 [[N]], [[S]]
	; CHECK-NEXT: [[XTRAITER:%.*]] = and i32 [[TMP1]], 7
	; CHECK-NEXT: [[LCMP_MOD:%.*]] = icmp ne i32 [[XTRAITER]], 0
	; CHECK-NEXT: br i1 [[LCMP_MOD]], label [[LOOP_PROL_PREHEADER:%.]], label [[LOOP_PROL_LOOPEXIT:%.]]
	; CHECK: loop.prol.preheader:
	; CHECK-NEXT: br label [[LOOP_PROL:%.*]]
	; CHECK: loop.prol:
	; CHECK-NEXT: [[I_PROL:%.]] = phi i32 [ [[S]], [[LOOP_PROL_PREHEADER]] ], [ [[I_INC_PROL:%.]], [[LATCH_PROL:%.*]] ]
	; CHECK-NEXT: [[PROL_ITER:%.]] = phi i32 [ [[XTRAITER]], [[LOOP_PROL_PREHEADER]] ], [ [[PROL_ITER_SUB:%.]], [[LATCH_PROL]] ]
	; CHECK-NEXT: [[C1_PROL:%.*]] = icmp eq i32 [[I_PROL]], [[N2]]
	; CHECK-NEXT: br i1 [[C1_PROL]], label [[EXIT1_LOOPEXIT1:%.*]], label [[LATCH_PROL]]
	; CHECK: latch.prol:
	; CHECK-NEXT: [[C2_PROL:%.*]] = icmp eq i32 [[I_PROL]], [[N]]
	; CHECK-NEXT: [[I_INC_PROL]] = add i32 [[I_PROL]], 1
	; CHECK-NEXT: [[PROL_ITER_SUB]] = sub i32 [[PROL_ITER]], 1
	; CHECK-NEXT: [[PROL_ITER_CMP:%.*]] = icmp ne i32 [[PROL_ITER_SUB]], 0
	; CHECK-NEXT: br i1 [[PROL_ITER_CMP]], label [[LOOP_PROL]], label [[LOOP_PROL_LOOPEXIT_UNR_LCSSA:%.*]], !llvm.loop [[LOOP0:![0-9]+]]
	; CHECK: loop.prol.loopexit.unr-lcssa:
	; CHECK-NEXT: [[I_UNR_PH:%.*]] = phi i32 [ [[I_INC_PROL]], [[LATCH_PROL]] ]
	; CHECK-NEXT: br label [[LOOP_PROL_LOOPEXIT]]
	; CHECK: loop.prol.loopexit:
	; CHECK-NEXT: [[I_UNR:%.]] = phi i32 [ [[S]], [[ENTRY:%.]] ], [ [[I_UNR_PH]], [[LOOP_PROL_LOOPEXIT_UNR_LCSSA]] ]
	; CHECK-NEXT: [[TMP3:%.*]] = icmp ult i32 [[TMP2]], 7
	; CHECK-NEXT: br i1 [[TMP3]], label [[EXIT2:%.]], label [[ENTRY_NEW:%.]]
	; CHECK: entry.new:
	; CHECK-NEXT: br label [[LOOP:%.*]]			; CHECK-NEXT: br label [[LOOP:%.*]]
	; CHECK: loop:			; CHECK: loop:
	; CHECK-NEXT: [[I:%.]] = phi i32 [ [[I_UNR]], [[ENTRY_NEW]] ], [ [[I_INC_7:%.]], [[LATCH_7:%.*]] ]			; CHECK-NEXT: [[I:%.]] = phi i32 [ [[S]], [[ENTRY:%.]] ], [ [[I_INC:%.]], [[LATCH:%.]] ]
	; CHECK-NEXT: [[C1:%.*]] = icmp eq i32 [[I]], [[N2]]			; CHECK-NEXT: [[C1:%.*]] = icmp eq i32 [[I]], [[N2]]
	; CHECK-NEXT: br i1 [[C1]], label [[EXIT1_LOOPEXIT:%.]], label [[LATCH:%.]]			; CHECK-NEXT: br i1 [[C1]], label [[EXIT1:%.*]], label [[LATCH]]
	; CHECK: latch:			; CHECK: latch:
	; CHECK-NEXT: [[I_INC:%.*]] = add i32 [[I]], 1			; CHECK-NEXT: [[C2:%.]] = icmp eq i32 [[I]], [[N:%.]]
	; CHECK-NEXT: [[C1_1:%.*]] = icmp eq i32 [[I_INC]], [[N2]]			; CHECK-NEXT: [[I_INC]] = add i32 [[I]], 1
	; CHECK-NEXT: br i1 [[C1_1]], label [[EXIT1_LOOPEXIT]], label [[LATCH_1:%.*]]			; CHECK-NEXT: br i1 [[C2]], label [[EXIT2:%.*]], label [[LOOP]]
	; CHECK: exit1.loopexit:
	; CHECK-NEXT: br label [[EXIT1:%.*]]
	; CHECK: exit1.loopexit1:
	; CHECK-NEXT: br label [[EXIT1]]
	; CHECK: exit1:			; CHECK: exit1:
	; CHECK-NEXT: ret void			; CHECK-NEXT: ret void
	; CHECK: exit2.unr-lcssa:
	; CHECK-NEXT: br label [[EXIT2]]
	; CHECK: exit2:			; CHECK: exit2:
	; CHECK-NEXT: ret void			; CHECK-NEXT: ret void
	; CHECK: latch.1:
	; CHECK-NEXT: [[I_INC_1:%.*]] = add i32 [[I_INC]], 1
	; CHECK-NEXT: [[C1_2:%.*]] = icmp eq i32 [[I_INC_1]], [[N2]]
	; CHECK-NEXT: br i1 [[C1_2]], label [[EXIT1_LOOPEXIT]], label [[LATCH_2:%.*]]
	; CHECK: latch.2:
	; CHECK-NEXT: [[I_INC_2:%.*]] = add i32 [[I_INC_1]], 1
	; CHECK-NEXT: [[C1_3:%.*]] = icmp eq i32 [[I_INC_2]], [[N2]]
	; CHECK-NEXT: br i1 [[C1_3]], label [[EXIT1_LOOPEXIT]], label [[LATCH_3:%.*]]
	; CHECK: latch.3:
	; CHECK-NEXT: [[I_INC_3:%.*]] = add i32 [[I_INC_2]], 1
	; CHECK-NEXT: [[C1_4:%.*]] = icmp eq i32 [[I_INC_3]], [[N2]]
	; CHECK-NEXT: br i1 [[C1_4]], label [[EXIT1_LOOPEXIT]], label [[LATCH_4:%.*]]
	; CHECK: latch.4:
	; CHECK-NEXT: [[I_INC_4:%.*]] = add i32 [[I_INC_3]], 1
	; CHECK-NEXT: [[C1_5:%.*]] = icmp eq i32 [[I_INC_4]], [[N2]]
	; CHECK-NEXT: br i1 [[C1_5]], label [[EXIT1_LOOPEXIT]], label [[LATCH_5:%.*]]
	; CHECK: latch.5:
	; CHECK-NEXT: [[I_INC_5:%.*]] = add i32 [[I_INC_4]], 1
	; CHECK-NEXT: [[C1_6:%.*]] = icmp eq i32 [[I_INC_5]], [[N2]]
	; CHECK-NEXT: br i1 [[C1_6]], label [[EXIT1_LOOPEXIT]], label [[LATCH_6:%.*]]
	; CHECK: latch.6:
	; CHECK-NEXT: [[I_INC_6:%.*]] = add i32 [[I_INC_5]], 1
	; CHECK-NEXT: [[C1_7:%.*]] = icmp eq i32 [[I_INC_6]], [[N2]]
	; CHECK-NEXT: br i1 [[C1_7]], label [[EXIT1_LOOPEXIT]], label [[LATCH_7]]
	; CHECK: latch.7:
	; CHECK-NEXT: [[C2_7:%.*]] = icmp eq i32 [[I_INC_6]], [[N]]
	; CHECK-NEXT: [[I_INC_7]] = add i32 [[I_INC_6]], 1
	; CHECK-NEXT: br i1 [[C2_7]], label [[EXIT2_UNR_LCSSA:%.*]], label [[LOOP]]
	;			;
	entry:			entry:
	%n2 = add i32 %s, 123			%n2 = add i32 %s, 123
	br label %loop			br label %loop

	loop:			loop:
	%i = phi i32 [ %s, %entry], [ %i.inc, %latch ]			%i = phi i32 [ %s, %entry], [ %i.inc, %latch ]
	%c1 = icmp eq i32 %i, %n2			%c1 = icmp eq i32 %i, %n2
	Show All 13 Lines
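
Editorial sketch of the decision described in the inline reply above: once a constant trip count is known, runtime unrolling is no longer considered, so the outcome depends on whether partial unrolling is allowed. `UnrollKind`, `UnrollOptions` and `chooseUnrollKind` are hypothetical names, not the pass's real interface, and the model deliberately ignores full unrolling and all cost thresholds.

```cpp
#include <cassert>
#include <optional>

// Hypothetical, simplified model; not the logic in LoopUnrollPass.cpp.
enum class UnrollKind { None, Partial, Runtime };

struct UnrollOptions {
  bool AllowPartial; // stands in for -unroll-allow-partial
  bool AllowRuntime; // stands in for -unroll-runtime
};

// Pick a strategy from whether a constant trip count is known.
UnrollKind chooseUnrollKind(std::optional<unsigned> KnownTripCount,
                            const UnrollOptions &Opts) {
  if (KnownTripCount) {
    // A known trip count rules out runtime unrolling; partial unrolling
    // is the remaining candidate, and only when it is enabled.
    return Opts.AllowPartial ? UnrollKind::Partial : UnrollKind::None;
  }
  // Without a known trip count, runtime unrolling is the only candidate.
  return Opts.AllowRuntime ? UnrollKind::Runtime : UnrollKind::None;
}
```

Under this model, a known trip count with only runtime unrolling enabled yields `UnrollKind::None`, which matches the "no unrolling" result in the updated expectations for this test.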

llvm/test/Transforms/LoopUnroll/scevunroll.ll

Show First 20 Lines • Show All 54 Lines • ▼ Show 20 Lines	while.body:
br i1 %cmp.i65, label %while.body, label %exit		br i1 %cmp.i65, label %while.body, label %exit

exit:		exit:
ret i32 %sum		ret i32 %sum
}		}

; SCEV unrolling properly handles loops with multiple exits. In this		; SCEV unrolling properly handles loops with multiple exits. In this
; case, the computed trip count based on a canonical IV is not for a		; case, the computed trip count based on a canonical IV is not for a
; latch block. Canonical unrolling incorrectly unrolls it, but SCEV		; latch block.
; unrolling does not.
define i64 @earlyLoopTest(i64* %base) nounwind {		define i64 @earlyLoopTest(i64* %base) nounwind {
; CHECK-LABEL: @earlyLoopTest(		; CHECK-LABEL: @earlyLoopTest(
; CHECK-NEXT: entry:		; CHECK-NEXT: entry:
; CHECK-NEXT: br label [[LOOP:%.*]]		; CHECK-NEXT: br label [[LOOP:%.*]]
; CHECK: loop:		; CHECK: loop:
; CHECK-NEXT: [[IV:%.]] = phi i64 [ 0, [[ENTRY:%.]] ], [ [[INC:%.]], [[TAIL:%.]] ]		; CHECK-NEXT: [[VAL:%.]] = load i64, i64 [[BASE:%.*]], align 4
; CHECK-NEXT: [[S:%.]] = phi i64 [ 0, [[ENTRY]] ], [ [[S_NEXT:%.]], [[TAIL]] ]		; CHECK-NEXT: br label [[TAIL:%.*]]
; CHECK-NEXT: [[ADR:%.]] = getelementptr i64, i64 [[BASE:%.*]], i64 [[IV]]
; CHECK-NEXT: [[VAL:%.]] = load i64, i64 [[ADR]], align 4
; CHECK-NEXT: [[S_NEXT]] = add i64 [[S]], [[VAL]]
; CHECK-NEXT: [[INC]] = add nuw nsw i64 [[IV]], 1
; CHECK-NEXT: [[CMP:%.*]] = icmp ne i64 [[INC]], 4
; CHECK-NEXT: br i1 [[CMP]], label [[TAIL]], label [[EXIT1:%.*]]
; CHECK: tail:		; CHECK: tail:
; CHECK-NEXT: [[CMP2:%.*]] = icmp ne i64 [[VAL]], 0		; CHECK-NEXT: [[CMP2:%.*]] = icmp ne i64 [[VAL]], 0
; CHECK-NEXT: br i1 [[CMP2]], label [[LOOP]], label [[EXIT2:%.*]]		; CHECK-NEXT: br i1 [[CMP2]], label [[LOOP_1:%.]], label [[EXIT2:%.]]
; CHECK: exit1:		; CHECK: exit1:
; CHECK-NEXT: [[S_LCSSA:%.*]] = phi i64 [ [[S]], [[LOOP]] ]		; CHECK-NEXT: [[S_LCSSA:%.]] = phi i64 [ [[S_NEXT_2:%.]], [[LOOP_3:%.*]] ]
; CHECK-NEXT: ret i64 [[S_LCSSA]]		; CHECK-NEXT: ret i64 [[S_LCSSA]]
; CHECK: exit2:		; CHECK: exit2:
; CHECK-NEXT: [[S_NEXT_LCSSA1:%.*]] = phi i64 [ [[S_NEXT]], [[TAIL]] ]		; CHECK-NEXT: [[S_NEXT_LCSSA1:%.]] = phi i64 [ [[VAL]], [[TAIL]] ], [ [[S_NEXT_1:%.]], [[TAIL_1:%.]] ], [ [[S_NEXT_2]], [[TAIL_2:%.]] ], [ [[S_NEXT_3:%.]], [[TAIL_3:%.]] ]
; CHECK-NEXT: ret i64 [[S_NEXT_LCSSA1]]		; CHECK-NEXT: ret i64 [[S_NEXT_LCSSA1]]
		; CHECK: loop.1:
		; CHECK-NEXT: [[ADR_1:%.]] = getelementptr i64, i64 [[BASE]], i64 1
		; CHECK-NEXT: [[VAL_1:%.]] = load i64, i64 [[ADR_1]], align 4
		; CHECK-NEXT: [[S_NEXT_1]] = add i64 [[VAL]], [[VAL_1]]
		; CHECK-NEXT: br label [[TAIL_1]]
		; CHECK: tail.1:
		; CHECK-NEXT: [[CMP2_1:%.*]] = icmp ne i64 [[VAL_1]], 0
		; CHECK-NEXT: br i1 [[CMP2_1]], label [[LOOP_2:%.*]], label [[EXIT2]]
		; CHECK: loop.2:
		; CHECK-NEXT: [[ADR_2:%.]] = getelementptr i64, i64 [[BASE]], i64 2
		; CHECK-NEXT: [[VAL_2:%.]] = load i64, i64 [[ADR_2]], align 4
		; CHECK-NEXT: [[S_NEXT_2]] = add i64 [[S_NEXT_1]], [[VAL_2]]
		; CHECK-NEXT: br label [[TAIL_2]]
		; CHECK: tail.2:
		; CHECK-NEXT: [[CMP2_2:%.*]] = icmp ne i64 [[VAL_2]], 0
		; CHECK-NEXT: br i1 [[CMP2_2]], label [[LOOP_3]], label [[EXIT2]]
		; CHECK: loop.3:
		; CHECK-NEXT: [[ADR_3:%.]] = getelementptr i64, i64 [[BASE]], i64 3
		; CHECK-NEXT: [[VAL_3:%.]] = load i64, i64 [[ADR_3]], align 4
		; CHECK-NEXT: [[S_NEXT_3]] = add i64 [[S_NEXT_2]], [[VAL_3]]
		; CHECK-NEXT: br i1 false, label [[TAIL_3]], label [[EXIT1:%.*]]
		; CHECK: tail.3:
		; CHECK-NEXT: br label [[EXIT2]]
;		;
entry:		entry:
br label %loop		br label %loop

loop:		loop:
%iv = phi i64 [ 0, %entry ], [ %inc, %tail ]		%iv = phi i64 [ 0, %entry ], [ %inc, %tail ]
%s = phi i64 [ 0, %entry ], [ %s.next, %tail ]		%s = phi i64 [ 0, %entry ], [ %s.next, %tail ]
%adr = getelementptr i64, i64* %base, i64 %iv		%adr = getelementptr i64, i64* %base, i64 %iv
Show All 15 Lines
}		}

; SCEV properly unrolls multi-exit loops.		; SCEV properly unrolls multi-exit loops.
define i32 @multiExit(i32* %base) nounwind {		define i32 @multiExit(i32* %base) nounwind {
; CHECK-LABEL: @multiExit(		; CHECK-LABEL: @multiExit(
; CHECK-NEXT: entry:		; CHECK-NEXT: entry:
; CHECK-NEXT: br label [[L1:%.*]]		; CHECK-NEXT: br label [[L1:%.*]]
; CHECK: l1:		; CHECK: l1:
; CHECK-NEXT: [[IV1:%.]] = phi i32 [ 0, [[ENTRY:%.]] ], [ [[INC1:%.]], [[L2:%.]] ]		; CHECK-NEXT: [[VAL:%.]] = load i32, i32 [[BASE:%.*]], align 4
; CHECK-NEXT: [[INC1]] = add nuw nsw i32 [[IV1]], 1		; CHECK-NEXT: br i1 false, label [[L2:%.]], label [[EXIT1:%.]]
; CHECK-NEXT: [[ADR:%.]] = getelementptr i32, i32 [[BASE:%.*]], i32 [[IV1]]
; CHECK-NEXT: [[VAL:%.]] = load i32, i32 [[ADR]], align 4
; CHECK-NEXT: br i1 false, label [[L2]], label [[EXIT1:%.*]]
; CHECK: l2:		; CHECK: l2:
; CHECK-NEXT: br i1 true, label [[L1]], label [[EXIT2:%.*]]		; CHECK-NEXT: ret i32 [[VAL]]
; CHECK: exit1:		; CHECK: exit1:
; CHECK-NEXT: ret i32 1		; CHECK-NEXT: ret i32 1
; CHECK: exit2:
; CHECK-NEXT: [[VAL_LCSSA1:%.*]] = phi i32 [ [[VAL]], [[L2]] ]
; CHECK-NEXT: ret i32 [[VAL_LCSSA1]]
;		;
entry:		entry:
br label %l1		br label %l1
l1:		l1:
%iv1 = phi i32 [ 0, %entry ], [ %inc1, %l2 ]		%iv1 = phi i32 [ 0, %entry ], [ %inc1, %l2 ]
%iv2 = phi i32 [ 0, %entry ], [ %inc2, %l2 ]		%iv2 = phi i32 [ 0, %entry ], [ %inc2, %l2 ]
%inc1 = add i32 %iv1, 1		%inc1 = add i32 %iv1, 1
%inc2 = add i32 %iv2, 1		%inc2 = add i32 %iv2, 1
%adr = getelementptr i32, i32* %base, i32 %iv1		%adr = getelementptr i32, i32* %base, i32 %iv1
%val = load i32, i32* %adr		%val = load i32, i32* %adr
%cmp1 = icmp slt i32 %iv1, 5		%cmp1 = icmp slt i32 %iv1, 5
br i1 %cmp1, label %l2, label %exit1		br i1 %cmp1, label %l2, label %exit1
l2:		l2:
%cmp2 = icmp slt i32 %iv2, 10		%cmp2 = icmp slt i32 %iv2, 10
br i1 %cmp2, label %l1, label %exit2		br i1 %cmp2, label %l1, label %exit2
exit1:		exit1:
ret i32 1		ret i32 1
exit2:		exit2:
ret i32 %val		ret i32 %val
}		}


; SCEV should not unroll a multi-exit loops unless the latch block has		; SCEV can unroll a multi-exit loops even if the latch block has no
; a known trip count, regardless of the early exit trip counts. The		; known trip count, but an early exit has a known trip count. In this
; LoopUnroll utility uses this assumption to optimize the latch		; case we must be careful not to optimize the latch branch away.
; block's branch.
define i32 @multiExitIncomplete(i32* %base) nounwind {		define i32 @multiExitIncomplete(i32* %base) nounwind {
; CHECK-LABEL: @multiExitIncomplete(		; CHECK-LABEL: @multiExitIncomplete(
; CHECK-NEXT: entry:		; CHECK-NEXT: entry:
; CHECK-NEXT: br label [[L1:%.*]]		; CHECK-NEXT: br label [[L1:%.*]]
; CHECK: l1:		; CHECK: l1:
; CHECK-NEXT: [[IV1:%.]] = phi i32 [ 0, [[ENTRY:%.]] ], [ [[INC1:%.]], [[L3:%.]] ]		; CHECK-NEXT: [[VAL:%.]] = load i32, i32 [[BASE:%.*]], align 4
; CHECK-NEXT: [[INC1]] = add nuw i32 [[IV1]], 1		; CHECK-NEXT: br label [[L2:%.*]]
; CHECK-NEXT: [[ADR:%.]] = getelementptr i32, i32 [[BASE:%.*]], i32 [[IV1]]
; CHECK-NEXT: [[VAL:%.]] = load i32, i32 [[ADR]], align 4
; CHECK-NEXT: [[CMP1:%.*]] = icmp ult i32 [[IV1]], 5
; CHECK-NEXT: br i1 [[CMP1]], label [[L2:%.]], label [[EXIT1:%.]]
; CHECK: l2:		; CHECK: l2:
; CHECK-NEXT: br i1 true, label [[L3]], label [[EXIT2:%.*]]		; CHECK-NEXT: br label [[L3:%.*]]
; CHECK: l3:		; CHECK: l3:
; CHECK-NEXT: [[CMP3:%.*]] = icmp ne i32 [[VAL]], 0		; CHECK-NEXT: [[CMP3:%.*]] = icmp ne i32 [[VAL]], 0
; CHECK-NEXT: br i1 [[CMP3]], label [[L1]], label [[EXIT3:%.*]]		; CHECK-NEXT: br i1 [[CMP3]], label [[L1_1:%.]], label [[EXIT3:%.]]
; CHECK: exit1:		; CHECK: exit1:
; CHECK-NEXT: ret i32 1		; CHECK-NEXT: ret i32 1
; CHECK: exit2:		; CHECK: exit2:
; CHECK-NEXT: ret i32 2		; CHECK-NEXT: ret i32 2
; CHECK: exit3:		; CHECK: exit3:
; CHECK-NEXT: ret i32 3		; CHECK-NEXT: ret i32 3
		; CHECK: l1.1:
		; CHECK-NEXT: [[ADR_1:%.]] = getelementptr i32, i32 [[BASE]], i32 1
		; CHECK-NEXT: [[VAL_1:%.]] = load i32, i32 [[ADR_1]], align 4
		; CHECK-NEXT: br label [[L2_1:%.*]]
		; CHECK: l2.1:
		; CHECK-NEXT: br label [[L3_1:%.*]]
		; CHECK: l3.1:
		; CHECK-NEXT: [[CMP3_1:%.*]] = icmp ne i32 [[VAL_1]], 0
		; CHECK-NEXT: br i1 [[CMP3_1]], label [[L1_2:%.*]], label [[EXIT3]]
		; CHECK: l1.2:
		; CHECK-NEXT: [[ADR_2:%.]] = getelementptr i32, i32 [[BASE]], i32 2
		; CHECK-NEXT: [[VAL_2:%.]] = load i32, i32 [[ADR_2]], align 4
		; CHECK-NEXT: br label [[L2_2:%.*]]
		; CHECK: l2.2:
		; CHECK-NEXT: br label [[L3_2:%.*]]
		; CHECK: l3.2:
		; CHECK-NEXT: [[CMP3_2:%.*]] = icmp ne i32 [[VAL_2]], 0
		; CHECK-NEXT: br i1 [[CMP3_2]], label [[L1_3:%.*]], label [[EXIT3]]
		; CHECK: l1.3:
		; CHECK-NEXT: [[ADR_3:%.]] = getelementptr i32, i32 [[BASE]], i32 3
		; CHECK-NEXT: [[VAL_3:%.]] = load i32, i32 [[ADR_3]], align 4
		; CHECK-NEXT: br label [[L2_3:%.*]]
		; CHECK: l2.3:
		; CHECK-NEXT: br label [[L3_3:%.*]]
		; CHECK: l3.3:
		; CHECK-NEXT: [[CMP3_3:%.*]] = icmp ne i32 [[VAL_3]], 0
		; CHECK-NEXT: br i1 [[CMP3_3]], label [[L1_4:%.*]], label [[EXIT3]]
		; CHECK: l1.4:
		; CHECK-NEXT: [[ADR_4:%.]] = getelementptr i32, i32 [[BASE]], i32 4
		; CHECK-NEXT: [[VAL_4:%.]] = load i32, i32 [[ADR_4]], align 4
		; CHECK-NEXT: br label [[L2_4:%.*]]
		; CHECK: l2.4:
		; CHECK-NEXT: br label [[L3_4:%.*]]
		; CHECK: l3.4:
		; CHECK-NEXT: [[CMP3_4:%.*]] = icmp ne i32 [[VAL_4]], 0
		; CHECK-NEXT: br i1 [[CMP3_4]], label [[L1_5:%.*]], label [[EXIT3]]
		; CHECK: l1.5:
		; CHECK-NEXT: br i1 false, label [[L2_5:%.]], label [[EXIT1:%.]]
		; CHECK: l2.5:
		; CHECK-NEXT: br i1 true, label [[L3_5:%.]], label [[EXIT2:%.]]
		; CHECK: l3.5:
		; CHECK-NEXT: br label [[EXIT3]]
;		;
entry:		entry:
br label %l1		br label %l1
l1:		l1:
%iv1 = phi i32 [ 0, %entry ], [ %inc1, %l3 ]		%iv1 = phi i32 [ 0, %entry ], [ %inc1, %l3 ]
%iv2 = phi i32 [ 0, %entry ], [ %inc2, %l3 ]		%iv2 = phi i32 [ 0, %entry ], [ %inc2, %l3 ]
%inc1 = add i32 %iv1, 1		%inc1 = add i32 %iv1, 1
%inc2 = add i32 %iv2, 1		%inc2 = add i32 %iv2, 1
▲ Show 20 Lines • Show All 72 Lines • ▼ Show 20 Lines
; iteration via the early exit. So loop unrolling cannot assume that		; iteration via the early exit. So loop unrolling cannot assume that
; the loop latch's exit count of zero is an upper bound on the number		; the loop latch's exit count of zero is an upper bound on the number
; of iterations.		; of iterations.
define void @nsw_latch(i32* %a) nounwind {		define void @nsw_latch(i32* %a) nounwind {
; CHECK-LABEL: @nsw_latch(		; CHECK-LABEL: @nsw_latch(
; CHECK-NEXT: entry:		; CHECK-NEXT: entry:
; CHECK-NEXT: br label [[FOR_BODY:%.*]]		; CHECK-NEXT: br label [[FOR_BODY:%.*]]
; CHECK: for.body:		; CHECK: for.body:
; CHECK-NEXT: [[B_03:%.]] = phi i32 [ 0, [[ENTRY:%.]] ], [ [[ADD:%.]], [[FOR_COND:%.]] ]		; CHECK-NEXT: br label [[FOR_COND:%.*]]
; CHECK-NEXT: [[TOBOOL:%.*]] = icmp eq i32 [[B_03]], 0
; CHECK-NEXT: [[ADD]] = add nuw nsw i32 [[B_03]], 8
; CHECK-NEXT: br i1 [[TOBOOL]], label [[FOR_COND]], label [[RETURN:%.*]]
; CHECK: for.cond:		; CHECK: for.cond:
; CHECK-NEXT: br i1 false, label [[RETURN]], label [[FOR_BODY]]		; CHECK-NEXT: br i1 false, label [[RETURN:%.]], label [[FOR_BODY_1:%.]]
; CHECK: return:		; CHECK: return:
; CHECK-NEXT: [[B_03_LCSSA:%.*]] = phi i32 [ 8, [[FOR_BODY]] ], [ 0, [[FOR_COND]] ]		; CHECK-NEXT: [[B_03_LCSSA:%.]] = phi i32 [ 0, [[FOR_COND]] ], [ 8, [[FOR_BODY_1]] ], [ 0, [[FOR_COND_1:%.]] ]
; CHECK-NEXT: [[RETVAL_0:%.*]] = phi i32 [ 1, [[FOR_BODY]] ], [ 0, [[FOR_COND]] ]		; CHECK-NEXT: [[RETVAL_0:%.*]] = phi i32 [ 0, [[FOR_COND]] ], [ 1, [[FOR_BODY_1]] ], [ 0, [[FOR_COND_1]] ]
; CHECK-NEXT: store i32 [[B_03_LCSSA]], i32* [[A:%.*]], align 4		; CHECK-NEXT: store i32 [[B_03_LCSSA]], i32* [[A:%.*]], align 4
; CHECK-NEXT: ret void		; CHECK-NEXT: ret void
		; CHECK: for.body.1:
		; CHECK-NEXT: br i1 false, label [[FOR_COND_1]], label [[RETURN]]
		; CHECK: for.cond.1:
		; CHECK-NEXT: br label [[RETURN]]
;		;
entry:		entry:
br label %for.body		br label %for.body

for.body: ; preds = %for.cond, %entry		for.body: ; preds = %for.cond, %entry
%b.03 = phi i32 [ 0, %entry ], [ %add, %for.cond ]		%b.03 = phi i32 [ 0, %entry ], [ %add, %for.cond ]
%tobool = icmp eq i32 %b.03, 0		%tobool = icmp eq i32 %b.03, 0
%add = add nsw i32 %b.03, 8		%add = add nsw i32 %b.03, 8
Show All 12 Lines

llvm/test/Transforms/LoopUnroll/unloop.ll

	Show First 20 Lines • Show All 477 Lines • ▼ Show 20 Lines
	; CHECK-NEXT: br label [[FOR_END78:%.*]]			; CHECK-NEXT: br label [[FOR_END78:%.*]]
	; CHECK: for.end78:			; CHECK: for.end78:
	; CHECK-NEXT: br i1 false, label [[PROC2_EXIT:%.*]], label [[FOR_COND_I_PREHEADER:%.*]]			; CHECK-NEXT: br i1 false, label [[PROC2_EXIT:%.*]], label [[FOR_COND_I_PREHEADER:%.*]]
	; CHECK: for.cond.i.preheader:			; CHECK: for.cond.i.preheader:
	; CHECK-NEXT: br label [[FOR_COND_I:%.*]]			; CHECK-NEXT: br label [[FOR_COND_I:%.*]]
	; CHECK: for.cond.i:			; CHECK: for.cond.i:
	; CHECK-NEXT: br label [[FOR_COND_I]]			; CHECK-NEXT: br label [[FOR_COND_I]]
	; CHECK: Proc2.exit:			; CHECK: Proc2.exit:
	; CHECK-NEXT: br label [[FOR_COND31]]			; CHECK-NEXT: unreachable
	; CHECK: for.end94:			; CHECK: for.end94:
	; CHECK-NEXT: ret void			; CHECK-NEXT: ret void
	;			;
	entry:			entry:
	br label %for.cond31			br label %for.cond31

	for.cond31:			for.cond31:
	br i1 undef, label %for.body35, label %for.end94			br i1 undef, label %for.body35, label %for.end94
	▲ Show 20 Lines • Show All 176 Lines • Show Last 20 Lines

llvm/test/Transforms/LoopUnroll/unroll-header-exiting-with-phis-multiple-exiting-blocks.ll

	; NOTE: Assertions have been autogenerated by utils/update_test_checks.py			; NOTE: Assertions have been autogenerated by utils/update_test_checks.py
	; RUN: opt -loop-unroll -S %s | FileCheck %s			; RUN: opt -loop-unroll -S %s | FileCheck %s

	; Loop with multiple exiting blocks, where the header exits but not the latch,			; Loop with multiple exiting blocks, where the header exits but not the latch,
	; e.g. because it has not been rotated.			; e.g. because it has not been rotated.
	define i16 @full_unroll_multiple_exiting_blocks(i16* %A, i16 %x, i16 %y) {			define i16 @full_unroll_multiple_exiting_blocks(i16* %A, i16 %x, i16 %y) {
	; CHECK-LABEL: @full_unroll_multiple_exiting_blocks(			; CHECK-LABEL: @full_unroll_multiple_exiting_blocks(
	; CHECK-NEXT: entry:			; CHECK-NEXT: entry:
	; CHECK-NEXT: br label [[HEADER:%.*]]			; CHECK-NEXT: br label [[HEADER:%.*]]
	; CHECK: header:			; CHECK: header:
	; CHECK-NEXT: [[RES:%.*]] = phi i16 [ 123, [[ENTRY:%.*]] ], [ [[RES_NEXT:%.*]], [[LATCH:%.*]] ]			; CHECK-NEXT: [[LV:%.*]] = load i16, i16* [[A:%.*]], align 2
	; CHECK-NEXT: [[I_0:%.*]] = phi i64 [ 0, [[ENTRY]] ], [ [[INC9:%.*]], [[LATCH]] ]			; CHECK-NEXT: [[RES_NEXT:%.*]] = add i16 123, [[LV]]
	; CHECK-NEXT: [[PTR:%.*]] = getelementptr inbounds i16, i16* [[A:%.*]], i64 [[I_0]]			; CHECK-NEXT: br label [[EXITING_1:%.*]]
	; CHECK-NEXT: [[LV:%.*]] = load i16, i16* [[PTR]], align 2
	; CHECK-NEXT: [[RES_NEXT]] = add i16 [[RES]], [[LV]]
	; CHECK-NEXT: [[CMP:%.*]] = icmp ult i64 [[I_0]], 3
	; CHECK-NEXT: br i1 [[CMP]], label [[EXITING_1:%.*]], label [[EXIT:%.*]]
	; CHECK: exiting.1:			; CHECK: exiting.1:
	; CHECK-NEXT: [[EC_1:%.*]] = icmp eq i16 [[LV]], [[X:%.*]]			; CHECK-NEXT: [[EC_1:%.*]] = icmp eq i16 [[LV]], [[X:%.*]]
	; CHECK-NEXT: br i1 [[EC_1]], label [[EXIT]], label [[EXITING_2:%.*]]			; CHECK-NEXT: br i1 [[EC_1]], label [[EXIT:%.*]], label [[EXITING_2:%.*]]
	; CHECK: exiting.2:			; CHECK: exiting.2:
	; CHECK-NEXT: [[EC_2:%.*]] = icmp eq i16 [[LV]], [[Y:%.*]]			; CHECK-NEXT: [[EC_2:%.*]] = icmp eq i16 [[LV]], [[Y:%.*]]
	; CHECK-NEXT: br i1 [[EC_2]], label [[EXIT]], label [[LATCH]]			; CHECK-NEXT: br i1 [[EC_2]], label [[EXIT]], label [[LATCH:%.*]]
	; CHECK: latch:			; CHECK: latch:
	; CHECK-NEXT: [[INC9]] = add i64 [[I_0]], 1			; CHECK-NEXT: [[PTR_1:%.*]] = getelementptr inbounds i16, i16* [[A]], i64 1
	; CHECK-NEXT: br label [[HEADER]]			; CHECK-NEXT: [[LV_1:%.*]] = load i16, i16* [[PTR_1]], align 2
				; CHECK-NEXT: [[RES_NEXT_1:%.*]] = add i16 [[RES_NEXT]], [[LV_1]]
				; CHECK-NEXT: br label [[EXITING_1_1:%.*]]
	; CHECK: exit:			; CHECK: exit:
	; CHECK-NEXT: [[RES_LCSSA:%.*]] = phi i16 [ [[RES_NEXT]], [[HEADER]] ], [ 0, [[EXITING_1]] ], [ 1, [[EXITING_2]] ]			; CHECK-NEXT: [[RES_LCSSA:%.*]] = phi i16 [ 0, [[EXITING_1]] ], [ 1, [[EXITING_2]] ], [ 0, [[EXITING_1_1]] ], [ 1, [[EXITING_2_1:%.*]] ], [ 0, [[EXITING_1_2:%.*]] ], [ 1, [[EXITING_2_2:%.*]] ], [ [[RES_NEXT_3:%.*]], [[LATCH_2:%.*]] ], [ 0, [[EXITING_1_3:%.*]] ], [ 1, [[EXITING_2_3:%.*]] ]
	; CHECK-NEXT: ret i16 [[RES_LCSSA]]			; CHECK-NEXT: ret i16 [[RES_LCSSA]]
				; CHECK: exiting.1.1:
				; CHECK-NEXT: [[EC_1_1:%.*]] = icmp eq i16 [[LV_1]], [[X]]
				; CHECK-NEXT: br i1 [[EC_1_1]], label [[EXIT]], label [[EXITING_2_1]]
				; CHECK: exiting.2.1:
				; CHECK-NEXT: [[EC_2_1:%.*]] = icmp eq i16 [[LV_1]], [[Y]]
				; CHECK-NEXT: br i1 [[EC_2_1]], label [[EXIT]], label [[LATCH_1:%.*]]
				; CHECK: latch.1:
				; CHECK-NEXT: [[PTR_2:%.*]] = getelementptr inbounds i16, i16* [[A]], i64 2
				; CHECK-NEXT: [[LV_2:%.*]] = load i16, i16* [[PTR_2]], align 2
				; CHECK-NEXT: [[RES_NEXT_2:%.*]] = add i16 [[RES_NEXT_1]], [[LV_2]]
				; CHECK-NEXT: br label [[EXITING_1_2]]
				; CHECK: exiting.1.2:
				; CHECK-NEXT: [[EC_1_2:%.*]] = icmp eq i16 [[LV_2]], [[X]]
				; CHECK-NEXT: br i1 [[EC_1_2]], label [[EXIT]], label [[EXITING_2_2]]
				; CHECK: exiting.2.2:
				; CHECK-NEXT: [[EC_2_2:%.*]] = icmp eq i16 [[LV_2]], [[Y]]
				; CHECK-NEXT: br i1 [[EC_2_2]], label [[EXIT]], label [[LATCH_2]]
				; CHECK: latch.2:
				; CHECK-NEXT: [[PTR_3:%.*]] = getelementptr inbounds i16, i16* [[A]], i64 3
				; CHECK-NEXT: [[LV_3:%.*]] = load i16, i16* [[PTR_3]], align 2
				; CHECK-NEXT: [[RES_NEXT_3]] = add i16 [[RES_NEXT_2]], [[LV_3]]
				; CHECK-NEXT: br i1 false, label [[EXITING_1_3]], label [[EXIT]]
				; CHECK: exiting.1.3:
				; CHECK-NEXT: [[EC_1_3:%.*]] = icmp eq i16 [[LV_3]], [[X]]
				; CHECK-NEXT: br i1 [[EC_1_3]], label [[EXIT]], label [[EXITING_2_3]]
				; CHECK: exiting.2.3:
				; CHECK-NEXT: [[EC_2_3:%.*]] = icmp eq i16 [[LV_3]], [[Y]]
				; CHECK-NEXT: br i1 [[EC_2_3]], label [[EXIT]], label [[LATCH_3:%.*]]
				; CHECK: latch.3:
				; CHECK-NEXT: unreachable
	;			;
	entry:			entry:
	br label %header			br label %header

	header:			header:
	%res = phi i16 [ 123, %entry ], [ %res.next, %latch ]			%res = phi i16 [ 123, %entry ], [ %res.next, %latch ]
	%i.0 = phi i64 [ 0, %entry ], [ %inc9, %latch ]			%i.0 = phi i64 [ 0, %entry ], [ %inc9, %latch ]
	%ptr = getelementptr inbounds i16, i16* %A, i64 %i.0			%ptr = getelementptr inbounds i16, i16* %A, i64 %i.0
	Show All 21 Lines