This is an archive of the discontinued LLVM Phabricator instance.

Differential D108108

[LoopPeel] Allow peeling with multiple unreachable-terminated exit blocks.
ClosedPublic

Authored by fhahn on Aug 16 2021, 2:09 AM.

Details

Reviewers

reames
efriedma
skatkov
mkazantsev
nikic

Commits

rG90d09eb300db: [LoopPeel] Allow peeling with multiple unreachable-terminated exit blocks.

Summary

Support for peeling with multiple exit blocks was added in D63921/77bb3a486fa6.

So far it has only been enabled for loops where all non-latch exits are
'de-optimizing' exits (D63923). But peeling of multi-exit loops can be
highly beneficial in other cases too, for example when all non-latch exit
blocks are terminated by unreachable.

The motivating case is loops with runtime checks, like the C++ example
below. The main issue preventing vectorization is that the invariant
accesses that load the bounds of B are conditionally executed in the loop
and cannot be hoisted out. If we peel off the first iteration, they
become dereferenceable in the loop, because they must execute before the
loop is executed, as all non-latch exits are terminated with
unreachable. This subsequently allows hoisting the loads and runtime
checks out of the loop, allowing vectorization of the loop.

int sum(std::vector<int> *A, std::vector<int> *B, int N) {
  int cost = 0;
  for (int i = 0; i < N; ++i)
    cost += A->at(i) + B->at(i);
  return cost;
}
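
To make the effect of peeling concrete, here is a hand-written sketch of the loop above with its first iteration peeled off. It is an illustration only: LoopPeel operates on LLVM IR rather than on C++ source, and the name sum_peeled is hypothetical.

```cpp
#include <cassert>
#include <vector>

// Original loop from the summary. Each at() call carries a bounds check whose
// failure path never returns to the loop.
int sum(std::vector<int> *A, std::vector<int> *B, int N) {
  int cost = 0;
  for (int i = 0; i < N; ++i)
    cost += A->at(i) + B->at(i);
  return cost;
}

// Hand-peeled equivalent (hypothetical name, source-level illustration only).
int sum_peeled(std::vector<int> *A, std::vector<int> *B, int N) {
  int cost = 0;
  if (N <= 0)
    return cost;
  // Peeled iteration 0: the bounds of A and B are read here, before the loop.
  cost += A->at(0) + B->at(0);
  // Remaining iterations: the loop is only reached if iteration 0 completed,
  // so the bounds loads are known to have executed already.
  for (int i = 1; i < N; ++i)
    cost += A->at(i) + B->at(i);
  return cost;
}
```

For in-range inputs sum_peeled returns the same value as sum; for out-of-range inputs both throw from at() at the same index.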

This gives a ~20-30% increase of score for Geekbench5/HDR on AArch64.

Note that this requires a follow-up improvement to the peeling cost
model to actually peel iterations off loops as above. I will share that
shortly.

Also, peeling of multi-exits might be beneficial for exit blocks with
other terminators, but I would like to keep the scope limited to known
high-reward cases for now.

I removed the option to disable peeling for multi-deopt exits because
the code is more general now. Alternatively, the option could also be
generalized, but I am not sure if there's much value in the option?

Diff Detail

Repository: rG LLVM Github Monorepo

Event Timeline

fhahn created this revision. Aug 16 2021, 2:09 AM

Herald added subscribers: wenlei, zzheng, hiraditya, kristof.beyls. Aug 16 2021, 2:09 AM

fhahn requested review of this revision. Aug 16 2021, 2:09 AM

Herald added a project: Restricted Project. Aug 16 2021, 2:09 AM

Can you please also at least add a test with

unreachable.exit.foo: ; preds = <...>
  call void @foo()
  br label %unreachable.exit.end

unreachable.exit.bar: ; preds = <...>
  call void @bar()
  br label %unreachable.exit.end

unreachable.exit.end:
  call void @baz()
  unreachable

nikic added a reviewer: nikic. Aug 16 2021, 2:45 AM

Harbormaster completed remote builds in B119665: Diff 366569. Aug 16 2021, 2:47 AM

fhahn mentioned this in rG38c3cebd7d5a: [LoopPeel] Add test with multiple exit blocks branching to unreachable. Aug 16 2021, 3:54 AM

In D108108#2946286, @lebedev.ri wrote:

Can you please also at least add a test with

unreachable.exit.foo: ; preds = <...>
  call void @foo()
  br label %unreachable.exit.end

unreachable.exit.bar: ; preds = <...>
  call void @bar()
  br label %unreachable.exit.end

unreachable.exit.end:
  call void @baz()
  unreachable

Thanks, should be added in 38c3cebd7d5a. Note that the current version does not support that case, as it would probably require somewhat more complex checking.
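
Why that case is rejected can be seen in a toy model of the exit check. The sketch below is a simplification under stated assumptions: Term, ExitBlock and allExitsDeoptOrUnreachable are hypothetical stand-ins, not LLVM types, and only the shape of the predicate follows the all_of check in this revision. Blocks such as unreachable.exit.foo end in a br, so the per-block check fails even though control eventually reaches unreachable.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// Hypothetical stand-ins for the terminator kinds that matter here.
enum class Term { Unreachable, Deoptimize, Branch, Return };

// Toy exit block: only its own terminator is modelled.
struct ExitBlock {
  Term terminator;
};

// Same shape as the check in this revision: every non-latch exit block must
// itself end in a deoptimize call or in unreachable. An empty list (the latch
// is the only exiting block) passes trivially.
bool allExitsDeoptOrUnreachable(const std::vector<ExitBlock> &nonLatchExits) {
  return std::all_of(nonLatchExits.begin(), nonLatchExits.end(),
                     [](const ExitBlock &bb) {
                       return bb.terminator == Term::Deoptimize ||
                              bb.terminator == Term::Unreachable;
                     });
}
```

In this model, two exits that each branch to a shared unreachable block are rejected, because the check looks only at each exit block's own terminator and does not follow successors.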

fhahn added a child revision: D108114: [LoopPeel] Peel if it turns invariant loads dereferenceable. Aug 16 2021, 3:56 AM

lebedev.ri added inline comments. Aug 16 2021, 2:24 PM

llvm/lib/Transforms/Utils/LoopPeel.cpp
103–109	(As per brief discussion in IRC) since this is about loop reentry, how about something like this then?

LGTM w/comment suggested.

llvm/lib/Transforms/Utils/LoopPeel.cpp
105	Can you add something to this comment to indicate that a) this is a proxy for a strong profile prediction of untaken, and b) this is a profitability check not a legality check?

This revision is now accepted and ready to land. Aug 17 2021, 9:13 AM

nikic added inline comments. Aug 17 2021, 9:47 AM

llvm/lib/Transforms/Utils/LoopPeel.cpp
105	I would really appreciate an explanation of why we want this heuristic... it's not really obvious to me why this is only beneficial if the other exits are likely not taken.

reames added inline comments. Aug 17 2021, 9:53 AM

llvm/lib/Transforms/Utils/LoopPeel.cpp
105	I don't know if this is Florian's answer, but mine would be: incrementalism and minimizing blast radius. We may eventually want to generalize further, but one step at a time.

mnadeem added a subscriber: mnadeem. Aug 18 2021, 5:50 PM

skatkov added inline comments. Aug 22 2021, 9:15 PM

llvm/lib/Transforms/Utils/LoopPeel.cpp
105	I guess the problem is that LoopPeeling cannot update branch weights for branches other than the latch. If we can teach LoopPeeling to update branch weights for non-latch branches, we may remove this restriction; however, it does not look like an easy task. Branch weights to deopt and unreachable are overwritten to be the smallest ones, so their update is simple (no update). This is how I remember the problem.

fhahn added inline comments. Aug 23 2021, 8:06 AM

llvm/lib/Transforms/Utils/LoopPeel.cpp
103–109	@lebedev.ri I agree that this relaxation would be fine/good for the existing cost heuristics, but as @skatkov mentioned, the problem with that case would be updating the branch weights. Unless you have additional concerns, I am planning on going with `unreachable`-terminated exits for now to avoid messing up branch weights. WDYT?
105	Thanks for taking a look! My main motivation was, as Philip suggested, mostly to keep the potential fallout limited; additionally, allowing exit blocks terminated by `unreachable` should not negatively impact the existing heuristics. @skatkov thanks for providing the extra context with respect to branch weights, I'll include that in the updated comment.

I see. Thanks, SGTM, looking forward to this, and further improvements!

rebased and comment added. I'll land this shortly.

Harbormaster completed remote builds in B121139: Diff 368600. Aug 25 2021, 4:44 AM

This revision was landed with ongoing or failed builds. Aug 25 2021, 5:27 AM

Closed by commit rG90d09eb300db: [LoopPeel] Allow peeling with multiple unreachable-terminated exit blocks. (authored by fhahn).

This revision was automatically updated to reflect the committed changes.

fhahn added a commit: rG90d09eb300db: [LoopPeel] Allow peeling with multiple unreachable-terminated exit blocks.

lebedev.ri mentioned this in D110922: [LoopPeel] Peel loops with exits followed by an unreachable or deopt block. Oct 1 2021, 5:05 AM

Revision Contents

Path                                                                    Size
llvm/lib/Transforms/Utils/LoopPeel.cpp                                  35 lines
llvm/test/Transforms/LoopUnroll/peel-loop-pgo-deopt-idom-2.ll           4 lines
llvm/test/Transforms/LoopUnroll/peel-loop-pgo-deopt-idom.ll             4 lines
llvm/test/Transforms/LoopUnroll/peel-loop-pgo-deopt.ll                  6 lines
llvm/test/Transforms/LoopUnroll/peel-multiple-unreachable-exits.ll      40 lines

Diff 366569

llvm/lib/Transforms/Utils/LoopPeel.cpp

 static cl::opt<unsigned> UnrollPeelMaxCount(
     "unroll-peel-max-count", cl::init(7), cl::Hidden,
     cl::desc("Max average trip count which will cause loop peeling."));

 static cl::opt<unsigned> UnrollForcePeelCount(
     "unroll-force-peel-count", cl::init(0), cl::Hidden,
     cl::desc("Force a peel count regardless of profiling information."));

-static cl::opt<bool> UnrollPeelMultiDeoptExit(
-    "unroll-peel-multi-deopt-exit", cl::init(true), cl::Hidden,
-    cl::desc("Allow peeling of loops with multiple deopt exits."));

 static const char *PeeledCountMetaData = "llvm.loop.peeled.count";

 // Designates that a Phi is estimated to become invariant after an "infinite"
 // number of loop iterations (i.e. only may become an invariant if the loop is
 // fully unrolled).
 static const unsigned InfiniteIterationsToInvariance =
     std::numeric_limits<unsigned>::max();

 // Check whether we are capable of peeling this loop.
 bool llvm::canPeel(Loop *L) {
   // Make sure the loop is in simplified form
   if (!L->isLoopSimplifyForm())
     return false;

-  if (UnrollPeelMultiDeoptExit) {
-    SmallVector<BasicBlock *, 4> Exits;
-    L->getUniqueNonLatchExitBlocks(Exits);
-    if (!Exits.empty()) {
-      // Latch's terminator is a conditional branch, Latch is exiting and
-      // all non Latch exits ends up with deoptimize.
-      const BasicBlock *Latch = L->getLoopLatch();
-      const BranchInst *T = dyn_cast<BranchInst>(Latch->getTerminator());
-      return T && T->isConditional() && L->isLoopExiting(Latch) &&
-             all_of(Exits, [](const BasicBlock *BB) {
-               return BB->getTerminatingDeoptimizeCall();
-             });
-    }
-  }
-  // Only peel loops that contain a single exit
-  if (!L->getExitingBlock() || !L->getUniqueExitBlock())
-    return false;

   // Don't try to peel loops where the latch is not the exiting block.
   // This can be an indication of two different things:
   // 1) The loop is not rotated.
   // 2) The loop contains irreducible control flow that involves the latch.
   const BasicBlock *Latch = L->getLoopLatch();
-  if (Latch != L->getExitingBlock())
+  if (!L->isLoopExiting(Latch))
     return false;

   // Peeling is only supported if the latch is a branch.
   if (!isa<BranchInst>(Latch->getTerminator()))
     return false;

-  return true;
+  SmallVector<BasicBlock *, 4> Exits;
+  L->getUniqueNonLatchExitBlocks(Exits);
+  // The latch must either be the only exiting block or all non-latch exit
+  // blocks are either have an deopt or unreachable terminator.

Inline comments on the comment above (LoopPeel.cpp, line 105):

reames (not done): Can you add something to this comment to indicate that a) this is a proxy for a strong profile prediction of untaken, and b) this is a profitability check not a legality check?

nikic (not done): I would really appreciate an explanation of why we want this heuristic... it's not really obvious to me why this is only beneficial if the other exits are likely not taken.

reames (not done): I don't know if this is Florian's answer, but mine would be: incrementalism and minimizing blast radius. We may eventually want to generalize further, but one step at a time.

skatkov (not done): I guess the problem is that LoopPeeling cannot update branch weights for branches other than the latch. If we can teach LoopPeeling to update branch weights for non-latch branches, we may remove this restriction; however, it does not look like an easy task. Branch weights to deopt and unreachable are overwritten to be the smallest ones, so their update is simple (no update). This is how I remember the problem.

fhahn (author, done): Thanks for taking a look! My main motivation was, as Philip suggested, mostly to keep the potential fallout limited; additionally, allowing exit blocks terminated by unreachable should not negatively impact the existing heuristics. @skatkov thanks for providing the extra context with respect to branch weights, I'll include that in the updated comment.

+  return all_of(Exits, [](const BasicBlock *BB) {
+    return BB->getTerminatingDeoptimizeCall() ||
+           isa<UnreachableInst>(BB->getTerminator());
+  });

Inline comments on lines 103–109:

lebedev.ri (not done): (As per brief discussion in IRC) since this is about loop reentry, how about something like this then?

   SmallVector<BasicBlock *, 4> Exits;
   L->getUniqueNonLatchExitBlocks(Exits);
   // The latch must either be the only exiting block or all non-latch exit
   // blocks are either have an deopt or unreachable terminator.
-  return all_of(Exits, [](const BasicBlock *BB) {
-    return BB->getTerminatingDeoptimizeCall() ||
-           isa<UnreachableInst>(BB->getTerminator());
-  });
+  return !isPotentiallyReachableFromMany(Exits, L->getLoopPreheader(), /*ExclusionSet=*/nullptr, DT, LI);
 }

 // This function calculates the number of iterations after which the given Phi

fhahn (author, done): @lebedev.ri I agree that this relaxation would be fine/good for the existing cost heuristics, but as @skatkov mentioned, the problem with that case would be updating the branch weights. Unless you have additional concerns, I am planning on going with unreachable-terminated exits for now to avoid messing up branch weights. WDYT?

 }

 // This function calculates the number of iterations after which the given Phi
 // becomes an invariant. The pre-calculated values are memorized in the map. The
 // function (shortcut is I) is calculated according to the following definition:
 // Given %x = phi <Inputs from above the loop>, ..., [%y, %back.edge].
 //   If %y is a loop invariant, then I(%x) = 1.
 //   If %y is a Phi from the loop header, I(%x) = I(%y) + 1.

llvm/test/Transforms/LoopUnroll/peel-loop-pgo-deopt-idom-2.ll

 ; REQUIRES: asserts
-; RUN: opt < %s -S -debug-only=loop-unroll -loop-unroll -unroll-runtime -unroll-peel-multi-deopt-exit 2>&1 | FileCheck %s
-; RUN: opt < %s -S -debug-only=loop-unroll -unroll-peel-multi-deopt-exit -passes='require<profile-summary>,function(require<opt-remark-emit>,loop-unroll)' 2>&1 | FileCheck %s
+; RUN: opt < %s -S -debug-only=loop-unroll -loop-unroll -unroll-runtime 2>&1 | FileCheck %s
+; RUN: opt < %s -S -debug-only=loop-unroll -passes='require<profile-summary>,function(require<opt-remark-emit>,loop-unroll)' 2>&1 | FileCheck %s

 ; Regression test for setting the correct idom for exit blocks.

 ; CHECK: Loop Unroll: F[basic]
 ; CHECK: PEELING loop %for.body with iteration count 2!

 define i32 @basic(i32* %p, i32 %k, i1 %c1, i1 %c2) #0 !prof !3 {
 entry:

llvm/test/Transforms/LoopUnroll/peel-loop-pgo-deopt-idom.ll

 ; REQUIRES: asserts
-; RUN: opt < %s -S -debug-only=loop-unroll -loop-unroll -unroll-runtime -unroll-peel-multi-deopt-exit 2>&1 | FileCheck %s
-; RUN: opt < %s -S -debug-only=loop-unroll -unroll-peel-multi-deopt-exit -passes='require<profile-summary>,function(require<opt-remark-emit>,loop-unroll)' 2>&1 | FileCheck %s
+; RUN: opt < %s -S -debug-only=loop-unroll -loop-unroll -unroll-runtime 2>&1 | FileCheck %s
+; RUN: opt < %s -S -debug-only=loop-unroll -passes='require<profile-summary>,function(require<opt-remark-emit>,loop-unroll)' 2>&1 | FileCheck %s

 ; Regression test for setting the correct idom for exit blocks.

 ; CHECK: Loop Unroll: F[basic]
 ; CHECK: PEELING loop %for.body with iteration count 2!

 define i32 @basic(i32* %p, i32 %k, i1 %c1, i1 %c2) #0 !prof !3 {
 entry:

llvm/test/Transforms/LoopUnroll/peel-loop-pgo-deopt.ll

 ; REQUIRES: asserts
-; RUN: opt < %s -S -debug-only=loop-unroll -loop-unroll -unroll-runtime -unroll-peel-multi-deopt-exit 2>&1 | FileCheck %s
-; RUN: opt < %s -S -debug-only=loop-unroll -unroll-peel-multi-deopt-exit -passes='require<profile-summary>,function(require<opt-remark-emit>,loop-unroll)' 2>&1 | FileCheck %s
-; RUN: opt < %s -S -debug-only=loop-unroll -unroll-peel-multi-deopt-exit -passes='require<profile-summary>,function(require<opt-remark-emit>,loop-unroll<no-profile-peeling>)' 2>&1 | FileCheck %s --check-prefixes=CHECK-NO-PEEL
+; RUN: opt < %s -S -debug-only=loop-unroll -loop-unroll -unroll-runtime 2>&1 | FileCheck %s
+; RUN: opt < %s -S -debug-only=loop-unroll -passes='require<profile-summary>,function(require<opt-remark-emit>,loop-unroll)' 2>&1 | FileCheck %s
+; RUN: opt < %s -S -debug-only=loop-unroll -passes='require<profile-summary>,function(require<opt-remark-emit>,loop-unroll<no-profile-peeling>)' 2>&1 | FileCheck %s --check-prefixes=CHECK-NO-PEEL

 ; Make sure we use the profile information correctly to peel-off 3 iterations
 ; from the loop, and update the branch weights for the peeled loop properly.
 ; All side exits to deopt does not change weigths.

 ; CHECK: Loop Unroll: F[basic]
 ; CHECK: PEELING loop %for.body with iteration count 4!
 ; CHECK-NO-PEEL-NOT: PEELING loop %for.body

llvm/test/Transforms/LoopUnroll/peel-multiple-unreachable-exits.ll

 ; NOTE: Assertions have been autogenerated by utils/update_test_checks.py
 ; RUN: opt -loop-unroll -S %s | FileCheck %s

 declare void @foo()

 define void @unroll_unreachable_exit_and_latch_exit(i32* %ptr, i32 %N, i32 %x) {
 ; CHECK-LABEL: @unroll_unreachable_exit_and_latch_exit(
 ; CHECK-NEXT: entry:
+; CHECK-NEXT: br label [[LOOP_HEADER_PEEL_BEGIN:%.*]]
+; CHECK: loop.header.peel.begin:
+; CHECK-NEXT: br label [[LOOP_HEADER_PEEL:%.*]]
+; CHECK: loop.header.peel:
+; CHECK-NEXT: [[C_PEEL:%.*]] = icmp ult i32 1, 2
+; CHECK-NEXT: br i1 [[C_PEEL]], label [[THEN_PEEL:%.*]], label [[ELSE_PEEL:%.*]]
+; CHECK: else.peel:
+; CHECK-NEXT: [[C_2_PEEL:%.*]] = icmp eq i32 1, [[X:%.*]]
+; CHECK-NEXT: br i1 [[C_2_PEEL]], label [[UNREACHABLE_EXIT:%.*]], label [[LOOP_LATCH_PEEL:%.*]]
+; CHECK: then.peel:
+; CHECK-NEXT: br label [[LOOP_LATCH_PEEL]]
+; CHECK: loop.latch.peel:
+; CHECK-NEXT: [[M_PEEL:%.*]] = phi i32 [ 0, [[THEN_PEEL]] ], [ [[X]], [[ELSE_PEEL]] ]
+; CHECK-NEXT: [[GEP_PEEL:%.*]] = getelementptr i32, i32* [[PTR:%.*]], i32 1
+; CHECK-NEXT: store i32 [[M_PEEL]], i32* [[GEP_PEEL]], align 4
+; CHECK-NEXT: [[IV_NEXT_PEEL:%.*]] = add nuw nsw i32 1, 1
+; CHECK-NEXT: [[C_3_PEEL:%.*]] = icmp ult i32 1, 1000
+; CHECK-NEXT: br i1 [[C_3_PEEL]], label [[LOOP_HEADER_PEEL_NEXT:%.*]], label [[EXIT:%.*]]
+; CHECK: loop.header.peel.next:
+; CHECK-NEXT: br label [[LOOP_HEADER_PEEL_NEXT1:%.*]]
+; CHECK: loop.header.peel.next1:
+; CHECK-NEXT: br label [[ENTRY_PEEL_NEWPH:%.*]]
+; CHECK: entry.peel.newph:
 ; CHECK-NEXT: br label [[LOOP_HEADER:%.*]]
 ; CHECK: loop.header:
-; CHECK-NEXT: [[IV:%.*]] = phi i32 [ 1, [[ENTRY:%.*]] ], [ [[IV_NEXT:%.*]], [[LOOP_LATCH:%.*]] ]
-; CHECK-NEXT: [[C:%.*]] = icmp ult i32 [[IV]], 2
-; CHECK-NEXT: br i1 [[C]], label [[THEN:%.*]], label [[ELSE:%.*]]
+; CHECK-NEXT: [[IV:%.*]] = phi i32 [ [[IV_NEXT_PEEL]], [[ENTRY_PEEL_NEWPH]] ], [ [[IV_NEXT:%.*]], [[LOOP_LATCH:%.*]] ]
+; CHECK-NEXT: br i1 false, label [[THEN:%.*]], label [[ELSE:%.*]]
 ; CHECK: then:
 ; CHECK-NEXT: br label [[LOOP_LATCH]]
 ; CHECK: else:
-; CHECK-NEXT: [[C_2:%.*]] = icmp eq i32 [[IV]], [[X:%.*]]
-; CHECK-NEXT: br i1 [[C_2]], label [[UNREACHABLE_EXIT:%.*]], label [[LOOP_LATCH]]
+; CHECK-NEXT: [[C_2:%.*]] = icmp eq i32 [[IV]], [[X]]
+; CHECK-NEXT: br i1 [[C_2]], label [[UNREACHABLE_EXIT_LOOPEXIT:%.*]], label [[LOOP_LATCH]]
 ; CHECK: loop.latch:
 ; CHECK-NEXT: [[M:%.*]] = phi i32 [ 0, [[THEN]] ], [ [[X]], [[ELSE]] ]
-; CHECK-NEXT: [[GEP:%.*]] = getelementptr i32, i32* [[PTR:%.*]], i32 [[IV]]
+; CHECK-NEXT: [[GEP:%.*]] = getelementptr i32, i32* [[PTR]], i32 [[IV]]
 ; CHECK-NEXT: store i32 [[M]], i32* [[GEP]], align 4
 ; CHECK-NEXT: [[IV_NEXT]] = add nuw nsw i32 [[IV]], 1
 ; CHECK-NEXT: [[C_3:%.*]] = icmp ult i32 [[IV]], 1000
-; CHECK-NEXT: br i1 [[C_3]], label [[LOOP_HEADER]], label [[EXIT:%.*]]
+; CHECK-NEXT: br i1 [[C_3]], label [[LOOP_HEADER]], label [[EXIT_LOOPEXIT:%.*]], !llvm.loop [[LOOP0:![0-9]+]]
+; CHECK: exit.loopexit:
+; CHECK-NEXT: br label [[EXIT]]
 ; CHECK: exit:
 ; CHECK-NEXT: ret void
+; CHECK: unreachable.exit.loopexit:
+; CHECK-NEXT: br label [[UNREACHABLE_EXIT]]
 ; CHECK: unreachable.exit:
 ; CHECK-NEXT: call void @foo()
 ; CHECK-NEXT: unreachable
 ;
 entry:
   br label %loop.header

 loop.header:
