
Details

Reviewers

mcrosier
efriedma
mkazantsev
junbuml

Commits

rGac2775889571: [LoopUnroll] Only peel if a predicate becomes known in the loop body.
rL330250: [LoopUnroll] Only peel if a predicate becomes known in the loop body.

Summary

If a predicate does not become known after peeling, peeling is unlikely
to be beneficial.

Diff Detail

Event Timeline

fhahn created this revision. Mar 28 2018, 9:22 AM

fhahn mentioned this in D43876: [LoopUnroll] Peel off iterations if it makes conditions true/false. Mar 28 2018, 9:30 AM

junbuml added a subscriber: junbuml. Mar 28 2018, 10:01 AM

junbuml added inline comments. Mar 28 2018, 10:19 AM

lib/Transforms/Utils/LoopUnrollPeel.cpp
218	if Pred or !Pred
test/Transforms/LoopUnroll/peel-loop-conditions.ll
341–342	I think you should add another test to show the case where MaxPeelCount is respected.
450	Better to add a comment here.

mcrosier added a reviewer: junbuml. Mar 28 2018, 11:47 AM

Thanks for having a look! I've tweaked the comments

lib/Transforms/Utils/LoopUnrollPeel.cpp
218	In terms of the original Pred, `Pred or !Pred` is precise, but at this stage Pred is set so that it is known in the peeled part (it is set to the inverse predicate earlier, if the original Pred is not known). Therefore I think it is slightly clearer to refer to just !Pred here. What do you think?
test/Transforms/LoopUnroll/peel-loop-conditions.ll
341–342	I think this test case still checks that MaxPeelCount is respected. Without checking MaxPeelCount, we would peel off 9999 iterations. I have updated the comment and tried to highlight that fact.
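The interaction between the cap and the new check can be sketched outside of LLVM. The following is a toy model only: it uses plain integers where the patch uses SCEV expressions, and `AffineIV` and `modelPeelCount` are hypothetical names that do not exist in LoopUnrollPeel.cpp. It extends the peel count while the predicate keeps its initial value, stops at `MaxPeelCount`, and adopts the count only if the predicate has flipped at the first iteration left in the loop.

```cpp
#include <cassert>
#include <cstdint>
#include <functional>

// Toy stand-in for an affine recurrence {Start,+,Step}. This is an
// assumption for illustration, not LLVM's SCEVAddRecExpr.
struct AffineIV {
  int64_t Start;
  int64_t Step;
  // Value at iteration Iter, ignoring overflow.
  int64_t at(unsigned Iter) const {
    return Start + Step * static_cast<int64_t>(Iter);
  }
};

// Returns how many iterations to peel, or 0 if the predicate has not
// flipped at the first iteration that would remain in the loop.
unsigned modelPeelCount(const AffineIV &IV,
                        const std::function<bool(int64_t)> &Pred,
                        unsigned MaxPeelCount) {
  assert(IV.Step != 0 && "a zero step never changes the predicate");
  const bool Initial = Pred(IV.at(0));
  unsigned NewPeelCount = 0;
  // Keep extending while the predicate still has its initial value.
  while (NewPeelCount < MaxPeelCount && Pred(IV.at(NewPeelCount)) == Initial)
    ++NewPeelCount;
  // Commit only if the predicate differs where the remaining loop starts.
  if (NewPeelCount != 0 && Pred(IV.at(NewPeelCount)) != Initial)
    return NewPeelCount;
  return 0;
}
```

With a start of 0, a step of 1, and a cap of 7, a predicate such as `I > 9999` hits the cap before it flips, so the model returns 0; a predicate such as `I < 2` flips at iteration 2, so the model returns 2.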

mkazantsev added inline comments. Mar 29 2018, 2:44 AM

lib/Transforms/Utils/LoopUnrollPeel.cpp
220	Check that `DesiredPeelCount != NewPeelCount` before asking SCEV; this will save us some compile time in most cases.

Add extra check

lib/Transforms/Utils/LoopUnrollPeel.cpp
220	Done, thanks!
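The ordering suggested here relies on short-circuit evaluation. As a minimal sketch, assuming a hypothetical helper `shouldAdoptNewCount` and a callback that stands in for the SCEV query (neither is part of the patch), the cheap integer comparison goes first so the expensive query is skipped whenever the count did not change:

```cpp
#include <cassert>
#include <functional>

// Hypothetical helper, for illustration only. ExpensiveQuery stands in for
// a costly analysis call such as a ScalarEvolution predicate query.
bool shouldAdoptNewCount(unsigned DesiredPeelCount, unsigned NewPeelCount,
                         const std::function<bool()> &ExpensiveQuery) {
  assert(ExpensiveQuery && "query callback must be set");
  // && evaluates left to right, so the query only runs if the counts differ.
  return DesiredPeelCount != NewPeelCount && ExpensiveQuery();
}
```

When both counts are equal, the callback is never invoked.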

I am not sure about the heuristic in general. This patch says that we peel off the first K iterations if, starting from the (K+1)th iteration, the predicate becomes known false. But isn't it even more profitable to peel off the (K+1)th iteration as well, because the predicate that was trivially true on the first iterations (and thus we could base optimizations on it) will be trivially false, and that will STILL allow you to base some optimizations on it?

I think a better approach would be to peel out another iteration if we know that Pred is true on it OR if we know that Pred is false. Either will allow us to simplify the peeled code. Please correct me if I'm wrong with my reasoning.

I am not sure what problem you are fighting in this patch.

mkazantsev added inline comments. Mar 29 2018, 3:21 AM

lib/Transforms/Utils/LoopUnrollPeel.cpp
200	BTW, I just realized that this entire optimization will be harmful if the predicate is provable for `LeftAR, RightSCEV`. In this case, something is true on EVERY iteration, and you peel off SOME iterations thinking that it would be profitable. Isn't that the real problem you are trying to solve here?

In D44983#1051359, @mkazantsev wrote:

I am not sure about the heuristic in general. This patch says that we peel off the first K iterations if, starting from the (K+1)th iteration, the predicate becomes known false. But isn't it even more profitable to peel off the (K+1)th iteration as well, because the predicate that was trivially true on the first iterations (and thus we could base optimizations on it) will be trivially false, and that will STILL allow you to base some optimizations on it?

I think a better approach would be to peel out another iteration if we know that Pred is true on it OR if we know that Pred is false. Either will allow us to simplify the peeled code. Please correct me if I'm wrong with my reasoning.

I think I see what you mean. You are saying we would not have to stop peeling once Pred changes from true to false, as the peeled code could still be simplified? We could, but I think the major benefit of this optimization is simplifying the loop body, and peeling more than necessary may prevent inlining due to the code size increase.

I am not sure what problem you are fighting in this patch.

Basically, I want to prevent peeling if, after peeling, we cannot simplify the loop body. See @test4: there we would peel off a few iterations and the peeled code could be simplified, but we cannot simplify the loop body after peeling. So we potentially add lots of additional instructions through peeling, with little benefit if the trip count is large, and that may prevent inlining.
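A source-level analogue may help to show the profitable case. This is a hand-written sketch, not output of the pass; `Calls`, `original`, and `peeled` are hypothetical names, and the counters stand in for the calls to `@f1` and `@f2` in the tests. With a condition like `I < 2`, peeling two iterations removes the compare from the remaining loop. In @test4 the compare is `icmp ugt i32 %i, 9999`, so peeling a handful of iterations leaves the compare in the loop body and only adds code.

```cpp
#include <cassert>

// Counters standing in for calls to two different functions.
struct Calls {
  unsigned F1 = 0;
  unsigned F2 = 0;
};

// Original loop: the branch on I < 2 is evaluated on every iteration.
Calls original(unsigned K) {
  Calls C;
  for (unsigned I = 0; I < K; ++I) {
    if (I < 2)
      ++C.F1;
    else
      ++C.F2;
  }
  return C;
}

// Hand-peeled version: the first two iterations are copied out of the loop,
// where I < 2 is known to be true.
Calls peeled(unsigned K) {
  Calls C;
  unsigned I = 0;
  if (I < K) { ++C.F1; ++I; } // peeled iteration 0
  if (I < K) { ++C.F1; ++I; } // peeled iteration 1
  // Either both peeled iterations ran or the loop is already finished.
  assert(I == 2 || I >= K);
  // I >= 2 in the remaining loop, so the compare folds to false.
  for (; I < K; ++I)
    ++C.F2;
  return C;
}
```

Both functions produce the same counts for any `K`; only the second one has a branch-free loop body.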

lib/Transforms/Utils/LoopUnrollPeel.cpp
200	Are you referring to the case where either Pred or !Pred is known for `LeftAR, RightSCEV` independently of the iteration? This case should be guarded against on line 176.

Thank you very much for having a look by the way! :) Please let me know if I missed something.

I see the point, but the predicate being true or false on the (K+1)th iteration also does not guarantee that we can simplify the loop. I agree that it is better than what we have now; I just wonder if this solution is general enough to deal with the problem you are trying to solve.

lib/Transforms/Utils/LoopUnrollPeel.cpp
200	Right, I hadn't noticed this. Thanks for pointing it out!

In D44983#1051403, @mkazantsev wrote:

I see the point, but the predicate being true or false on the (K+1)th iteration also does not guarantee that we can simplify the loop. I agree that it is better than what we have now; I just wonder if this solution is general enough to deal with the problem you are trying to solve.

Ah yes, I see the case you are referring to now. Maybe it would be worth adding a check for monotonic predicates?

In D44983#1051407, @fhahn wrote:

In D44983#1051403, @mkazantsev wrote:

I see the point, but the predicate being true or false on the (K+1)th iteration also does not guarantee that we can simplify the loop. I agree that it is better than what we have now; I just wonder if this solution is general enough to deal with the problem you are trying to solve.

Ah yes, I see the case you are referring to now. Maybe it would be worth adding a check for monotonic predicates?

As an option, yes. Alternatively, you know the number of iterations you are peeling off, meaning that you can use evaluateAtIteration to get what remains in the loop after your manipulations. If the predicate or !predicate is known for this value, then that is what we want.
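To make the idea of evaluating a recurrence at a given iteration concrete, here is a toy model with plain integers. `AffineRec`, `evaluateAt`, and `firstIterationNotLess` are hypothetical names, and the closed form `Start + Step * N` ignores overflow, which the real analysis cannot do. The start and step values used in the note below mirror the two induction variables of @test6 in the updated test file (`{0,+,2}` and `{4,+,1}`).

```cpp
#include <cassert>
#include <cstdint>

// Toy affine recurrence {Start,+,Step}; an assumption for illustration.
struct AffineRec {
  int64_t Start;
  int64_t Step;
};

// Closed form of the recurrence at iteration N, ignoring overflow.
int64_t evaluateAt(const AffineRec &AR, unsigned N) {
  return AR.Start + AR.Step * static_cast<int64_t>(N);
}

// Returns the first iteration below Limit at which L < R no longer holds,
// or Limit if the comparison holds on every iteration that was inspected.
unsigned firstIterationNotLess(const AffineRec &L, const AffineRec &R,
                               unsigned Limit) {
  assert(Limit > 0 && "need at least one iteration to inspect");
  for (unsigned N = 0; N < Limit; ++N)
    if (!(evaluateAt(L, N) < evaluateAt(R, N)))
      return N;
  return Limit;
}
```

For `{0,+,2}` compared against `{4,+,1}`, the values are 0 < 4, 2 < 5, 4 < 6, 6 < 7, and then 8 < 8 fails, so the comparison first stops holding at iteration 4. Seeing this requires evaluating both sides at the same iteration.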

Update to peel only for monotonic predicates that switch from true to false (or vice versa). In that case, we should be able to simplify the loop body.

For ICMP_EQ and ICMP_NE without wrapping, I think we could peel off an additional iteration and then the loop body could be simplified too. I've added that as a FIXME and intend to add it in a follow-up commit. I can also add it to this commit if you prefer.
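The difference between relational and equality predicates can be checked by brute force on a toy model. This sketch is an illustration under the assumption of a non-wrapping affine induction variable; `Pred`, `holds`, and `countFlips` are hypothetical names and not LLVM APIs. It counts how often the truth value of the comparison changes over a range of iterations.

```cpp
#include <cassert>
#include <cstdint>

// Small subset of comparison predicates, for illustration only.
enum class Pred { LT, GE, EQ, NE };

bool holds(Pred P, int64_t L, int64_t R) {
  switch (P) {
  case Pred::LT: return L < R;
  case Pred::GE: return L >= R;
  case Pred::EQ: return L == R;
  case Pred::NE: return L != R;
  }
  return false;
}

// Counts how often the truth value of (Start + Step * N) P Rhs changes
// for N in [0, Iters). Overflow is assumed not to occur.
unsigned countFlips(Pred P, int64_t Start, int64_t Step, int64_t Rhs,
                    unsigned Iters) {
  assert(Iters > 0 && "need at least one iteration");
  unsigned Flips = 0;
  bool Prev = holds(P, Start, Rhs);
  for (unsigned N = 1; N < Iters; ++N) {
    bool Cur = holds(P, Start + Step * static_cast<int64_t>(N), Rhs);
    if (Cur != Prev)
      ++Flips;
    Prev = Cur;
  }
  return Flips;
}
```

In this model a relational predicate changes its value at most once, so it stays fixed after the flip. An equality predicate that becomes true in the middle of the range changes twice, while one that holds on the very first iteration changes only once, which matches the case where peeling one more iteration would help.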

mkazantsev added inline comments. Apr 9 2018, 2:26 AM

lib/Transforms/Utils/LoopUnrollPeel.cpp
227	How about checking it before you start calculating `NewPeelCount`?

Herald added a subscriber: zzheng. Apr 9 2018, 2:26 AM

Move monotonic check to the beginning, thanks Max

LGTM

This revision is now accepted and ready to land. Apr 16 2018, 2:27 AM

Closed by commit rL330250: [LoopUnroll] Only peel if a predicate becomes known in the loop body. (authored by fhahn). Apr 18 2018, 5:32 AM

This revision was automatically updated to reflect the committed changes.

lebedev.ri mentioned this in D69617: [LoopUnroll] countToEliminateCompares(): fix handling of [in]equality predicates (PR43840). Oct 30 2019, 8:22 AM

Diff 140194

lib/Transforms/Utils/LoopUnrollPeel.cpp

[...]	for (auto *BB : L.blocks()) {

const SCEVAddRecExpr *LeftAR = cast<SCEVAddRecExpr>(LeftSCEV);		const SCEVAddRecExpr *LeftAR = cast<SCEVAddRecExpr>(LeftSCEV);

// Avoid huge SCEV computations in the loop below and make sure we only		// Avoid huge SCEV computations in the loop below and make sure we only
// consider AddRecs of the loop we are trying to peel.		// consider AddRecs of the loop we are trying to peel.
if (!LeftAR->isAffine() || LeftAR->getLoop() != &L)		if (!LeftAR->isAffine() || LeftAR->getLoop() != &L)
continue;		continue;

// Check if extending DesiredPeelCount lets us evaluate Pred.		// Check if extending the current DesiredPeelCount lets us evaluate Pred
		// or !Pred in the loop body statically.
		unsigned NewPeelCount = DesiredPeelCount;

const SCEV *IterVal = LeftAR->evaluateAtIteration(		const SCEV *IterVal = LeftAR->evaluateAtIteration(
SE.getConstant(LeftSCEV->getType(), DesiredPeelCount), SE);		SE.getConstant(LeftSCEV->getType(), NewPeelCount), SE);

// If the original condition is not known, get the negated predicate		// If the original condition is not known, get the negated predicate
// (which holds on the else branch) and check if it is known. This allows		// (which holds on the else branch) and check if it is known. This allows
// us to peel of iterations that make the original condition false.		// us to peel of iterations that make the original condition false.
if (!SE.isKnownPredicate(Pred, IterVal, RightSCEV))		if (!SE.isKnownPredicate(Pred, IterVal, RightSCEV))
Pred = ICmpInst::getInversePredicate(Pred);		Pred = ICmpInst::getInversePredicate(Pred);

const SCEV *Step = LeftAR->getStepRecurrence(SE);		const SCEV *Step = LeftAR->getStepRecurrence(SE);
while (DesiredPeelCount < MaxPeelCount &&		while (NewPeelCount < MaxPeelCount &&
SE.isKnownPredicate(Pred, IterVal, RightSCEV)) {		SE.isKnownPredicate(Pred, IterVal, RightSCEV)) {
IterVal = SE.getAddExpr(IterVal, Step);		IterVal = SE.getAddExpr(IterVal, Step);
DesiredPeelCount++;		NewPeelCount++;
}		}

		// Only peel the loop if if !Pred becomes known in the first iteration of
		// the loop body after peeling.
		if (SE.isKnownPredicate(ICmpInst::getInversePredicate(Pred), IterVal,
		RightSCEV))
		DesiredPeelCount = NewPeelCount;
}		}

return DesiredPeelCount;		return DesiredPeelCount;
}		}

// Return the number of iterations we want to peel off.		// Return the number of iterations we want to peel off.
void llvm::computePeelCount(Loop *L, unsigned LoopSize,		void llvm::computePeelCount(Loop *L, unsigned LoopSize,
TargetTransformInfo::UnrollingPreferences &UP,		TargetTransformInfo::UnrollingPreferences &UP,
unsigned &TripCount, ScalarEvolution &SE) {		unsigned &TripCount, ScalarEvolution &SE) {
assert(LoopSize > 0 && "Zero loop size is not allowed!");		assert(LoopSize > 0 && "Zero loop size is not allowed!");
UP.PeelCount = 0;		UP.PeelCount = 0;
if (!canPeel(L))		if (!canPeel(L))
return;		return;
[...]

test/Transforms/LoopUnroll/peel-loop-conditions.ll

[...]	for.inc:
%inc = add nsw i32 %i.05, 1		%inc = add nsw i32 %i.05, 1
%cmp = icmp slt i32 %inc, %k		%cmp = icmp slt i32 %inc, %k
br i1 %cmp, label %for.body, label %for.end		br i1 %cmp, label %for.body, label %for.end

for.end:		for.end:
ret void		ret void
}		}

; Test that we respect MaxPeelCount		; Test that we only peel off iterations if it simplifies a condition in the
		; loop body after peeling at most MaxPeelCount iterations.
define void @test4(i32 %k) {		define void @test4(i32 %k) {
; CHECK-LABEL: @test4(		; CHECK-LABEL: @test4(
; CHECK-NEXT: for.body.lr.ph:		; CHECK-NEXT: for.body.lr.ph:
; CHECK-NEXT: br label [[FOR_BODY_PEEL_BEGIN:%.*]]
; CHECK: for.body.peel.begin:
; CHECK-NEXT: br label [[FOR_BODY_PEEL:%.*]]
; CHECK: for.body.peel:
; CHECK-NEXT: [[CMP1_PEEL:%.*]] = icmp ugt i32 0, 9999
; CHECK-NEXT: br i1 [[CMP1_PEEL]], label [[IF_THEN_PEEL:%.*]], label [[FOR_INC_PEEL:%.*]]
; CHECK: if.then.peel:
; CHECK-NEXT: call void @f1()
; CHECK-NEXT: br label [[FOR_INC_PEEL]]
; CHECK: for.inc.peel:
; CHECK-NEXT: [[INC_PEEL:%.*]] = add nsw i32 0, 1
; CHECK-NEXT: [[CMP_PEEL:%.*]] = icmp slt i32 [[INC_PEEL]], [[K:%.*]]
; CHECK-NEXT: br i1 [[CMP_PEEL]], label [[FOR_BODY_PEEL_NEXT:%.*]], label [[FOR_END:%.*]]
; CHECK: for.body.peel.next:
; CHECK-NEXT: br label [[FOR_BODY_PEEL2:%.*]]
; CHECK: for.body.peel2:
; CHECK-NEXT: [[CMP1_PEEL3:%.*]] = icmp ugt i32 [[INC_PEEL]], 9999
; CHECK-NEXT: br i1 [[CMP1_PEEL3]], label [[IF_THEN_PEEL4:%.*]], label [[FOR_INC_PEEL5:%.*]]
; CHECK: if.then.peel4:
; CHECK-NEXT: call void @f1()
; CHECK-NEXT: br label [[FOR_INC_PEEL5]]
; CHECK: for.inc.peel5:
; CHECK-NEXT: [[INC_PEEL6:%.*]] = add nsw i32 [[INC_PEEL]], 1
; CHECK-NEXT: [[CMP_PEEL7:%.*]] = icmp slt i32 [[INC_PEEL6]], [[K]]
; CHECK-NEXT: br i1 [[CMP_PEEL7]], label [[FOR_BODY_PEEL_NEXT1:%.*]], label [[FOR_END]]
; CHECK: for.body.peel.next1:
; CHECK-NEXT: br label [[FOR_BODY_PEEL9:%.*]]
; CHECK: for.body.peel9:
; CHECK-NEXT: [[CMP1_PEEL10:%.*]] = icmp ugt i32 [[INC_PEEL6]], 9999
; CHECK-NEXT: br i1 [[CMP1_PEEL10]], label [[IF_THEN_PEEL11:%.*]], label [[FOR_INC_PEEL12:%.*]]
; CHECK: if.then.peel11:
; CHECK-NEXT: call void @f1()
; CHECK-NEXT: br label [[FOR_INC_PEEL12]]
; CHECK: for.inc.peel12:
; CHECK-NEXT: [[INC_PEEL13:%.*]] = add nsw i32 [[INC_PEEL6]], 1
; CHECK-NEXT: [[CMP_PEEL14:%.*]] = icmp slt i32 [[INC_PEEL13]], [[K]]
; CHECK-NEXT: br i1 [[CMP_PEEL14]], label [[FOR_BODY_PEEL_NEXT8:%.*]], label [[FOR_END]]
; CHECK: for.body.peel.next8:
; CHECK-NEXT: br label [[FOR_BODY_PEEL16:%.*]]
; CHECK: for.body.peel16:
; CHECK-NEXT: [[CMP1_PEEL17:%.*]] = icmp ugt i32 [[INC_PEEL13]], 9999
; CHECK-NEXT: br i1 [[CMP1_PEEL17]], label [[IF_THEN_PEEL18:%.*]], label [[FOR_INC_PEEL19:%.*]]
; CHECK: if.then.peel18:
; CHECK-NEXT: call void @f1()
; CHECK-NEXT: br label [[FOR_INC_PEEL19]]
; CHECK: for.inc.peel19:
; CHECK-NEXT: [[INC_PEEL20:%.*]] = add nsw i32 [[INC_PEEL13]], 1
; CHECK-NEXT: [[CMP_PEEL21:%.*]] = icmp slt i32 [[INC_PEEL20]], [[K]]
; CHECK-NEXT: br i1 [[CMP_PEEL21]], label [[FOR_BODY_PEEL_NEXT15:%.*]], label [[FOR_END]]
; CHECK: for.body.peel.next15:
; CHECK-NEXT: br label [[FOR_BODY_PEEL23:%.*]]
; CHECK: for.body.peel23:
; CHECK-NEXT: [[CMP1_PEEL24:%.*]] = icmp ugt i32 [[INC_PEEL20]], 9999
; CHECK-NEXT: br i1 [[CMP1_PEEL24]], label [[IF_THEN_PEEL25:%.*]], label [[FOR_INC_PEEL26:%.*]]
; CHECK: if.then.peel25:
; CHECK-NEXT: call void @f1()
; CHECK-NEXT: br label [[FOR_INC_PEEL26]]
; CHECK: for.inc.peel26:
; CHECK-NEXT: [[INC_PEEL27:%.*]] = add nsw i32 [[INC_PEEL20]], 1
; CHECK-NEXT: [[CMP_PEEL28:%.*]] = icmp slt i32 [[INC_PEEL27]], [[K]]
; CHECK-NEXT: br i1 [[CMP_PEEL28]], label [[FOR_BODY_PEEL_NEXT22:%.*]], label [[FOR_END]]
; CHECK: for.body.peel.next22:
; CHECK-NEXT: br label [[FOR_BODY_PEEL30:%.*]]
; CHECK: for.body.peel30:
; CHECK-NEXT: [[CMP1_PEEL31:%.*]] = icmp ugt i32 [[INC_PEEL27]], 9999
; CHECK-NEXT: br i1 [[CMP1_PEEL31]], label [[IF_THEN_PEEL32:%.*]], label [[FOR_INC_PEEL33:%.*]]
; CHECK: if.then.peel32:
; CHECK-NEXT: call void @f1()
; CHECK-NEXT: br label [[FOR_INC_PEEL33]]
; CHECK: for.inc.peel33:
; CHECK-NEXT: [[INC_PEEL34:%.*]] = add nsw i32 [[INC_PEEL27]], 1
; CHECK-NEXT: [[CMP_PEEL35:%.*]] = icmp slt i32 [[INC_PEEL34]], [[K]]
; CHECK-NEXT: br i1 [[CMP_PEEL35]], label [[FOR_BODY_PEEL_NEXT29:%.*]], label [[FOR_END]]
; CHECK: for.body.peel.next29:
; CHECK-NEXT: br label [[FOR_BODY_PEEL37:%.*]]
; CHECK: for.body.peel37:
; CHECK-NEXT: [[CMP1_PEEL38:%.*]] = icmp ugt i32 [[INC_PEEL34]], 9999
; CHECK-NEXT: br i1 [[CMP1_PEEL38]], label [[IF_THEN_PEEL39:%.*]], label [[FOR_INC_PEEL40:%.*]]
; CHECK: if.then.peel39:
; CHECK-NEXT: call void @f1()
; CHECK-NEXT: br label [[FOR_INC_PEEL40]]
; CHECK: for.inc.peel40:
; CHECK-NEXT: [[INC_PEEL41:%.*]] = add nsw i32 [[INC_PEEL34]], 1
; CHECK-NEXT: [[CMP_PEEL42:%.*]] = icmp slt i32 [[INC_PEEL41]], [[K]]
; CHECK-NEXT: br i1 [[CMP_PEEL42]], label [[FOR_BODY_PEEL_NEXT36:%.*]], label [[FOR_END]]
; CHECK: for.body.peel.next36:
; CHECK-NEXT: br label [[FOR_BODY_PEEL_NEXT43:%.*]]
; CHECK: for.body.peel.next43:
; CHECK-NEXT: br label [[FOR_BODY_LR_PH_PEEL_NEWPH:%.*]]
; CHECK: for.body.lr.ph.peel.newph:
; CHECK-NEXT: br label [[FOR_BODY:%.*]]		; CHECK-NEXT: br label [[FOR_BODY:%.*]]
; CHECK: for.body:		; CHECK: for.body:
; CHECK-NEXT: [[I_05:%.*]] = phi i32 [ [[INC_PEEL41]], [[FOR_BODY_LR_PH_PEEL_NEWPH]] ], [ [[INC:%.*]], [[FOR_INC:%.*]] ]		; CHECK-NEXT: [[I_05:%.*]] = phi i32 [ 0, [[FOR_BODY_LR_PH:%.*]] ], [ [[INC:%.*]], [[FOR_INC:%.*]] ]
; CHECK-NEXT: [[CMP1:%.*]] = icmp ugt i32 [[I_05]], 9999		; CHECK-NEXT: [[CMP1:%.*]] = icmp ugt i32 [[I_05]], 9999
; CHECK-NEXT: br i1 [[CMP1]], label [[IF_THEN:%.*]], label [[FOR_INC]]		; CHECK-NEXT: br i1 [[CMP1]], label [[IF_THEN:%.*]], label [[FOR_INC]]
; CHECK: if.then:		; CHECK: if.then:
; CHECK-NEXT: call void @f1()		; CHECK-NEXT: call void @f1()
; CHECK-NEXT: br label [[FOR_INC]]		; CHECK-NEXT: br label [[FOR_INC]]
; CHECK: for.inc:		; CHECK: for.inc:
; CHECK-NEXT: [[INC]] = add nuw nsw i32 [[I_05]], 1		; CHECK-NEXT: [[INC]] = add nsw i32 [[I_05]], 1
; CHECK-NEXT: [[CMP:%.*]] = icmp slt i32 [[INC]], [[K]]		; CHECK-NEXT: [[CMP:%.*]] = icmp slt i32 [[INC]], [[K:%.*]]
; CHECK-NEXT: br i1 [[CMP]], label [[FOR_BODY]], label [[FOR_END_LOOPEXIT:%.*]], !llvm.loop !4		; CHECK-NEXT: br i1 [[CMP]], label [[FOR_BODY]], label [[FOR_END:%.*]]
; CHECK: for.end.loopexit:
; CHECK-NEXT: br label [[FOR_END]]
; CHECK: for.end:		; CHECK: for.end:
; CHECK-NEXT: ret void		; CHECK-NEXT: ret void
;		;
for.body.lr.ph:		for.body.lr.ph:
br label %for.body		br label %for.body

for.body:		for.body:
%i.05 = phi i32 [ 0, %for.body.lr.ph ], [ %inc, %for.inc ]		%i.05 = phi i32 [ 0, %for.body.lr.ph ], [ %inc, %for.inc ]
[...]	outer.inc:
%outer.cmp = icmp slt i32 %j.inc, %k		%outer.cmp = icmp slt i32 %j.inc, %k
br i1 %outer.cmp, label %outer.header, label %for.end		br i1 %outer.cmp, label %outer.header, label %for.end


for.end:		for.end:
ret void		ret void
}		}

		; In this test, the condition involves 2 AddRecs. Without evaluating both
		; AddRecs, we cannot prove that the condition becomes known in the loop body
		; after peeling.
define void @test6(i32 %k) {		define void @test6(i32 %k) {
; CHECK-LABEL: @test6(		; CHECK-LABEL: @test6(
; CHECK-NEXT: entry:		; CHECK-NEXT: entry:
; CHECK-NEXT: br label [[FOR_BODY_PEEL_BEGIN:%.*]]
; CHECK: for.body.peel.begin:
; CHECK-NEXT: br label [[FOR_BODY_PEEL:%.*]]
; CHECK: for.body.peel:
; CHECK-NEXT: [[CMP1_PEEL:%.*]] = icmp ult i32 0, 4
; CHECK-NEXT: br i1 [[CMP1_PEEL]], label [[IF_THEN_PEEL:%.*]], label [[IF_ELSE_PEEL:%.*]]
; CHECK: if.else.peel:
; CHECK-NEXT: call void @f2()
; CHECK-NEXT: br label [[FOR_INC_PEEL:%.*]]
; CHECK: if.then.peel:
; CHECK-NEXT: call void @f1()
; CHECK-NEXT: br label [[FOR_INC_PEEL]]
; CHECK: for.inc.peel:
; CHECK-NEXT: [[INC_PEEL:%.*]] = add nsw i32 0, 2
; CHECK-NEXT: [[J_INC_PEEL:%.*]] = add nsw i32 4, 1
; CHECK-NEXT: [[CMP_PEEL:%.*]] = icmp slt i32 [[INC_PEEL]], [[K:%.*]]
; CHECK-NEXT: br i1 [[CMP_PEEL]], label [[FOR_BODY_PEEL_NEXT:%.*]], label [[FOR_END:%.*]]
; CHECK: for.body.peel.next:
; CHECK-NEXT: br label [[FOR_BODY_PEEL_NEXT1:%.*]]
; CHECK: for.body.peel.next1:
; CHECK-NEXT: br label [[ENTRY_PEEL_NEWPH:%.*]]
; CHECK: entry.peel.newph:
; CHECK-NEXT: br label [[FOR_BODY:%.*]]		; CHECK-NEXT: br label [[FOR_BODY:%.*]]
; CHECK: for.body:		; CHECK: for.body:
; CHECK-NEXT: [[I_05:%.*]] = phi i32 [ [[INC_PEEL]], [[ENTRY_PEEL_NEWPH]] ], [ [[INC:%.*]], [[FOR_INC:%.*]] ]		; CHECK-NEXT: [[I_05:%.*]] = phi i32 [ 0, [[ENTRY:%.*]] ], [ [[INC:%.*]], [[FOR_INC:%.*]] ]
; CHECK-NEXT: [[J:%.*]] = phi i32 [ [[INC_PEEL]], [[ENTRY_PEEL_NEWPH]] ], [ [[INC]], [[FOR_INC]] ]		; CHECK-NEXT: [[J:%.*]] = phi i32 [ 4, [[ENTRY]] ], [ [[J_INC:%.*]], [[FOR_INC]] ]
; CHECK-NEXT: br i1 false, label [[IF_THEN:%.*]], label [[IF_ELSE:%.*]]		; CHECK-NEXT: [[CMP1:%.*]] = icmp ult i32 [[I_05]], [[J]]
		; CHECK-NEXT: br i1 [[CMP1]], label [[IF_THEN:%.*]], label [[IF_ELSE:%.*]]
; CHECK: if.then:		; CHECK: if.then:
; CHECK-NEXT: call void @f1()		; CHECK-NEXT: call void @f1()
; CHECK-NEXT: br label [[FOR_INC]]		; CHECK-NEXT: br label [[FOR_INC]]
; CHECK: if.else:		; CHECK: if.else:
; CHECK-NEXT: call void @f2()		; CHECK-NEXT: call void @f2()
; CHECK-NEXT: br label [[FOR_INC]]		; CHECK-NEXT: br label [[FOR_INC]]
; CHECK: for.inc:		; CHECK: for.inc:
; CHECK-NEXT: [[INC]] = add nuw nsw i32 [[I_05]], 2		; CHECK-NEXT: [[INC]] = add nsw i32 [[I_05]], 2
; CHECK-NEXT: [[CMP:%.*]] = icmp slt i32 [[INC]], [[K]]		; CHECK-NEXT: [[J_INC]] = add nsw i32 [[J]], 1
; CHECK-NEXT: br i1 [[CMP]], label [[FOR_BODY]], label [[FOR_END_LOOPEXIT:%.*]], !llvm.loop !5		; CHECK-NEXT: [[CMP:%.*]] = icmp slt i32 [[INC]], [[K:%.*]]
; CHECK: for.end.loopexit:		; CHECK-NEXT: br i1 [[CMP]], label [[FOR_BODY]], label [[FOR_END:%.*]]
; CHECK-NEXT: br label [[FOR_END]]
; CHECK: for.end:		; CHECK: for.end:
; CHECK-NEXT: ret void		; CHECK-NEXT: ret void
;		;
entry:		entry:
br label %for.body		br label %for.body

for.body:		for.body:
%i.05 = phi i32 [ 0, %entry ], [ %inc, %for.inc ]		%i.05 = phi i32 [ 0, %entry ], [ %inc, %for.inc ]
%j = phi i32 [ 4, %entry ], [ %inc, %for.inc ]		%j = phi i32 [ 4, %entry ], [ %j.inc, %for.inc ]
%cmp1 = icmp ult i32 %i.05, %j		%cmp1 = icmp ult i32 %i.05, %j
br i1 %cmp1, label %if.then, label %if.else		br i1 %cmp1, label %if.then, label %if.else

if.then:		if.then:
call void @f1()		call void @f1()
br label %for.inc		br label %for.inc

if.else:		if.else:
[...]

This is an archive of the discontinued LLVM Phabricator instance.

[LoopUnroll] Only peel if a predicate becomes known in the loop body.
ClosedPublic
