Diff 90122

lib/Transforms/Utils/LoopUnrollPeel.cpp

[...]	void llvm::computePeelCount(Loop *L, unsigned LoopSize,
UP.PeelCount = 0;		UP.PeelCount = 0;
if (!canPeel(L))		if (!canPeel(L))
return;		return;

// Only try to peel innermost loops.		// Only try to peel innermost loops.
if (!L->empty())		if (!L->empty())
return;		return;

		// Try to find a Phi node that has the same loop invariant as an input from
		// its only back edge. If there is such Phi, peeling 1 iteration from the
		// loop is profitable, because starting from 2nd iteration we will have an
		// invariant instead of this Phi.
		if (auto *BackEdge = L->getLoopLatch()) {
		BasicBlock *Header = L->getHeader();
		// Iterate over Phis to find one with invariant input on back edge.
		bool FoundCandidate = false;
		PHINode *Phi;
		for (auto BI = Header->begin(); (Phi = dyn_cast<PHINode>(&*BI)); ++BI) {
		mkuper (Done): Why the parens around (Phi = ...)?
		Value *Input = Phi->getIncomingValueForBlock(BackEdge);
		if (L->isLoopInvariant(Input)) {
		FoundCandidate = true;
		break;
		}
		}
		if (FoundCandidate) {
		DEBUG(dbgs() << "Peel one iteration to get rid of " << *Phi
		<< " because starting from 2nd iteration it is always"
		<< " an invariant\n");
		UP.PeelCount = 1;
		return;
		}
		}

// Bail if we know the statically calculated trip count.		// Bail if we know the statically calculated trip count.
// In this case we rather prefer partial unrolling.		// In this case we rather prefer partial unrolling.
if (TripCount)		if (TripCount)
return;		return;

// If the user provided a peel count, use that.		// If the user provided a peel count, use that.
bool UserPeelCount = UnrollForcePeelCount.getNumOccurrences() > 0;		bool UserPeelCount = UnrollForcePeelCount.getNumOccurrences() > 0;
if (UserPeelCount) {		if (UserPeelCount) {
[...]	if (*PeelCount) {
DEBUG(dbgs() << "Max peel cost: " << UP.Threshold << "\n");		DEBUG(dbgs() << "Max peel cost: " << UP.Threshold << "\n");
}		}
}		}

return;		return;
}		}

/// \brief Update the branch weights of the latch of a peeled-off loop		/// \brief Update the branch weights of the latch of a peeled-off loop
/// iteration.		/// iteration.
		mkuper (Done): Can you just use L->getLoopLatch()? Or is that different from what you're looking for?
/// This sets the branch weights for the latch of the recently peeled off loop		/// This sets the branch weights for the latch of the recently peeled off loop
		anna (Done): There is a method for this, `getNumBackEdges`. I think you can just reuse it and check against 1.
/// iteration correctly.		/// iteration correctly.
/// Our goal is to make sure that:		/// Our goal is to make sure that:
		anna (Done): Can change this to `for (auto *Pred: predecessors(Header))`.
/// a) The total weight of all the copies of the loop body is preserved.		/// a) The total weight of all the copies of the loop body is preserved.
/// b) The total weight of the loop exit is preserved.		/// b) The total weight of the loop exit is preserved.
		reames (Not Done): Out of curiosity, why the complexity about finding the backedge? Wouldn't all of the inputs to the phi be loop invariant in the case you're interested in?
		mkazantsev (author, Not Done): Of course, all inputs from preheaders will be invariant. Finding the backedge for a header with n predecessors takes O(n) (it needs a traversal over all predecessors with a "contains" check in a set, which takes O(1)). Acquiring its input also takes O(n) for every Phi, so the total complexity is O(nm) for m Phis. If we just check all inputs for being invariant, it will also take O(nm), but we will get positive results for loops with multiple back edges. The current implementation of peeling expects the loop to have 1 back edge; otherwise it bails, so we would do unneeded work on such loops.
/// c) The body weight is reasonably distributed between the peeled iterations.		/// c) The body weight is reasonably distributed between the peeled iterations.
///		///
/// \param Header The copy of the header block that belongs to next iteration.		/// \param Header The copy of the header block that belongs to next iteration.
/// \param LatchBR The copy of the latch branch that belongs to this iteration.		/// \param LatchBR The copy of the latch branch that belongs to this iteration.
/// \param IterNumber The serial number of the iteration that was just		/// \param IterNumber The serial number of the iteration that was just
/// peeled off.		/// peeled off.
/// \param AvgIters The average number of iterations we expect the loop to have.		/// \param AvgIters The average number of iterations we expect the loop to have.
/// \param[in,out] PeeledHeaderWeight The total number of dynamic loop		/// \param[in,out] PeeledHeaderWeight The total number of dynamic loop
[...]
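
The Phi scan added to `computePeelCount` above can be modelled outside LLVM. What follows is a minimal sketch under stated assumptions: `ToyPhi`, `findInvariantBackedgePhi`, and the `DefinedInLoop` set are hypothetical names made up for illustration and are not LLVM APIs. It mirrors only the idea of the patch: look at each header Phi's input from the single latch, and report the first one whose input is not defined inside the loop.

```cpp
#include <cassert>
#include <cstddef>
#include <set>
#include <string>
#include <utility>
#include <vector>

// Toy stand-in for a Phi node (NOT an LLVM class): values and blocks are
// identified by name only.
struct ToyPhi {
  std::string Name;
  // (predecessor block, incoming value) pairs.
  std::vector<std::pair<std::string, std::string>> Incoming;
};

// Returns the index of the first Phi whose input from the latch block is
// defined outside the loop (i.e. is loop invariant in this toy model),
// or -1 if there is no such Phi.
int findInvariantBackedgePhi(const std::vector<ToyPhi> &HeaderPhis,
                             const std::string &Latch,
                             const std::set<std::string> &DefinedInLoop) {
  for (std::size_t I = 0; I < HeaderPhis.size(); ++I)
    for (const auto &In : HeaderPhis[I].Incoming)
      if (In.first == Latch && DefinedInLoop.count(In.second) == 0)
        return static_cast<int>(I);
  return -1;
}
```

Fed the three Phis of the `@invariant_backedge_1` test (`%i`, `%sum`, `%plus`) with `loop` as the latch, this model selects `%plus`, because its back-edge input `%b` is a function argument rather than a value computed in the loop.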

test/Transforms/LoopUnroll/peel-loop-not-forced.ll

This file was added.

				; RUN: opt < %s -S -loop-unroll -indvars -loop-deletion -simplifycfg -instcombine | FileCheck %s
				mkuper (Done): We generally prefer to test passes in isolation. Can you please make this a test for loop-unroll only?

				define i32 @invariant_backedge_1(i32 %a, i32 %b) {
				; CHECK-LABEL: @invariant_backedge_1
				; CHECK: entry:
				; CHECK-NEXT: %0 = mul i32 %b, 999
				; CHECK-NEXT: %1 = add i32 %0, %a
				; CHECK-NEXT: ret i32 %1
				entry:
				br label %loop

				loop:
				%i = phi i32 [ 0, %entry ], [ %inc, %loop ]
				%sum = phi i32 [ 0, %entry ], [ %incsum, %loop ]
				%plus = phi i32 [ %a, %entry ], [ %b, %loop ]

				%incsum = add i32 %sum, %plus
				%inc = add i32 %i, 1
				%cmp = icmp slt i32 %i, 1000
				br i1 %cmp, label %loop, label %exit

				exit:
				ret i32 %sum
				}
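
To see why the CHECK lines expect `%b * 999 + %a`, the loop can be transcribed by hand. The sketch below is an illustration only, not output of any LLVM pass: `originalLoop`, `peeledLoop`, and `closedForm` are hypothetical helper names, and `uint32_t` is assumed as a stand-in for the wrapping `i32` arithmetic of the test. The peeled variant runs the first iteration (where `%plus` is `%a`) ahead of the loop, after which `%plus` is always `%b`.

```cpp
#include <cassert>
#include <cstdint>

// Hand transcription of the loop in @invariant_backedge_1.
std::uint32_t originalLoop(std::uint32_t A, std::uint32_t B) {
  std::uint32_t I = 0, Sum = 0, Plus = A;  // Phi values on entry
  for (;;) {
    std::uint32_t IncSum = Sum + Plus;
    std::uint32_t Inc = I + 1;
    if (!(I < 1000u))
      return Sum;  // the exit block returns the Phi %sum, not %incsum
    I = Inc;
    Sum = IncSum;
    Plus = B;  // back-edge input of %plus is the invariant %b
  }
}

// Same computation with the first iteration peeled off by hand.
std::uint32_t peeledLoop(std::uint32_t A, std::uint32_t B) {
  // Peeled iteration: i = 0, sum = 0, plus = A; 0 < 1000, so the loop
  // is entered with i = 1 and sum = A.
  std::uint32_t I = 1, Sum = A;
  for (;;) {
    std::uint32_t IncSum = Sum + B;  // %plus has been replaced by B
    if (!(I < 1000u))
      return Sum;
    I = I + 1;
    Sum = IncSum;
  }
}

// The value the CHECK lines describe: mul %b, 999 followed by add %a.
std::uint32_t closedForm(std::uint32_t A, std::uint32_t B) {
  return B * 999u + A;
}
```

The first addition contributes `A` and the following 999 contribute `B` each; the final iteration (`i == 1000`) exits before its addition is observed, which is where the factor 999 comes from.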

This is an archive of the discontinued LLVM Phabricator instance.

[LoopUnrolling] Peel loops with invariant backedge Phi input
Closed, Public
