This is an archive of the discontinued LLVM Phabricator instance.

[LoopBoundSplit] Handle the case in which exiting block is loop header
ClosedPublic

Authored by jaykang10 on Sep 20 2021, 3:54 AM.

Details

Summary

If exiting block is loop header, phi node could be AddRecValue. Handle the case.

It is for fixing https://bugs.llvm.org/show_bug.cgi?id=51866.

Diff Detail

Event Timeline

jaykang10 created this revision.Sep 20 2021, 3:54 AM
jaykang10 requested review of this revision.Sep 20 2021, 3:54 AM
Herald added a project: Restricted Project. · View Herald TranscriptSep 20 2021, 3:54 AM
jaykang10 edited the summary of this revision. (Show Details)Sep 20 2021, 4:00 AM

I completely don't understand neither what happened nor the fix. Could you please give more details on how and why this bug is happening?

I completely don't understand neither what happened nor the fix. Could you please give more details on how and why this bug is happening?

Sorry for poor explanation... Let me try to explain something more.

We need to update the IV's start value in post loop. In the example from https://bugs.llvm.org/show_bug.cgi?id=51866, the IV's start value in post loop is not updated correctly as below.
%i.0.split = phi i16 [ %inc.split, %for.inc.split ], [ 0, %for.cond.split.preheader ].
As @uabelho mentioned, it should be [ 5, %for.cond.split.preheader ].

The code has expected the ExitingCond.AddRecValue is add instruction like %inc = add nuw nsw i16 %i.0, 1. The ExitingCond.AddRecValue is calculated from exiting block's condition. For example,

for.inc:
  %inc = add nuw nsw i64 %iv, 1
  %cond = icmp sgt i64 %inc, %n
  br i1 %cond, label %exit, label %loop

In above example, the latch is exiting block and the ExitingCond.AddRecValue is %inc = add nuw nsw i64 %iv, 1.

for.cond:                                         ; preds = %for.inc, %entry
  %i.0 = phi i16 [ 0, %entry ], [ %inc, %for.inc ]
  %exitcond.not = icmp eq i16 %i.0, 10
  br i1 %exitcond.not, label %for.end, label %for.body

In above example, the header is exiting block and the ExitingCond.AddRecValue is %i.0 = phi i16 [ 0, %entry ], [ %inc, %for.inc ]. Previously, the code missed this case.

This patch detects this case and update the IV's start value in post loop correctly. If you feel something wrong, please let me know.

mkazantsev added inline comments.Sep 21 2021, 9:18 PM
llvm/test/Transforms/LoopBoundSplit/bug51866.ll
64

Could you please add one more Phi next to it (starting from different value, say 10) and make sure it's also properly updated? I don't understand how it will be done.

jaykang10 added inline comments.Sep 22 2021, 12:51 AM
llvm/test/Transforms/LoopBoundSplit/bug51866.ll
64

You are right! Multiple IVs are not updated with this patch. Let me try to handle it. Thanks @mkazantsev

jaykang10 updated this revision to Diff 374169.Sep 22 2021, 3:03 AM

https://reviews.llvm.org/D110060 is merged into this patch.

Following the comment of @mkazantsev, updated patch.

  • If the Cond.AddRecValue is PHI node, update Cond.NonPHIAddRecValue with value from backedge.
  • Use Cond.NonPHIAddRecValue when the start value of IV in post loop is updated.
jaykang10 added inline comments.Sep 22 2021, 3:05 AM
llvm/test/Transforms/LoopBoundSplit/bug51866.ll
64

I have added %i.1 = phi i16 [ 10, %entry ], [ %inc, %for.inc ]. It is not updated because it is not SCEVAddRecExpr.

mkazantsev requested changes to this revision.Sep 22 2021, 11:18 PM
mkazantsev added inline comments.
llvm/lib/Transforms/Scalar/LoopBoundSplit.cpp
81

You can use L.getLoopPreheader or L.getLoopPredecessor to find a block from which a non-loop value is coming.

llvm/test/Transforms/LoopBoundSplit/bug51866.ll
40

Why is it 10 again?

This revision now requires changes to proceed.Sep 22 2021, 11:18 PM
jaykang10 added inline comments.Sep 23 2021, 12:37 AM
llvm/lib/Transforms/Scalar/LoopBoundSplit.cpp
81

Yep, let me update it.

llvm/test/Transforms/LoopBoundSplit/bug51866.ll
40

Ah, you are right! I made a mistake. The incoming value from preheader in post-loop should be new bound. Let me update it.

jaykang10 updated this revision to Diff 374470.Sep 23 2021, 1:25 AM

Following comments from @mkazantsev, updated patch.

mkazantsev requested changes to this revision.Sep 23 2021, 9:48 PM
mkazantsev added inline comments.
llvm/test/Transforms/LoopBoundSplit/bug51866.ll
40

And why is it 5?.. It should be 15, no?

This revision now requires changes to proceed.Sep 23 2021, 9:48 PM
jaykang10 added inline comments.Sep 23 2021, 11:46 PM
llvm/test/Transforms/LoopBoundSplit/bug51866.ll
40

um... The %i.1 is not AddRecExpr...

for.cond: 
  %i.0 = phi i16 [ 0, %entry ], [ %inc, %for.inc ]
  %i.1 = phi i16 [ 10, %entry ], [ %inc, %for.inc ]
...
for.inc:                                          ; preds = %if.else, %if.then
  %inc = add nuw nsw i16 %i.0, 1

The value of %i.1 will be 10 at first iteration and it will follow %inc at next iterations. In this case, the start value of %i.1 in post loop is 5. If %i.1 is AddRecExpr as below, the start value will be 15.

for.cond: 
  %i.0 = phi i16 [ 0, %entry ], [ %inc, %for.inc ]
  %i.1 = phi i16 [ 10, %entry ], [ %inc.1, %for.inc ]
...
for.inc:                                          ; preds = %if.else, %if.then
  %inc = add nuw nsw i16 %i.0, 1
  %inc.1 = add nuw nsw i16 %i.1, 1

I think you mention the case where the phi is AddRecExpr and the code is missing the case. Let me check and update the code. Thanks for the detailed review.

jaykang10 updated this revision to Diff 374780.Sep 24 2021, 3:01 AM

Update the incoming value of phi nodes in header of post-loop correctly.

mkazantsev added inline comments.Sep 24 2021, 7:41 PM
llvm/lib/Transforms/Scalar/LoopBoundSplit.cpp
412

This is fundamentally wrong for two reasons.

First - we should never base our logic on SCEV's ability to prove some fact, even if we know this for sure and SCEV was able to prove the very same thing in the past.

In particular - we should not even rely on (quite obvious) assumption that SCEV will always be able to prove that something that is increases by a constant has AddRecSCEV. It is likely to happen, but SCEV has *no guarantees* that it will surely prove things.

Second - and what about phis that are not addrecs? Do they not need update? For example, imagine code

for (i = 2; i < N; i = i * i) {...}

The phi here won't be an addrec. But it will still need an update when you change the iteration space, or at least I don't see any reason why it won't.

jaykang10 added inline comments.Sep 27 2021, 12:45 AM
llvm/lib/Transforms/Scalar/LoopBoundSplit.cpp
412

You are right. Let me think about how to handle the phis.

jaykang10 updated this revision to Diff 375276.Sep 27 2021, 8:47 AM

Update phi nodes in header of post-loop using LCSSA.

We can say that the start value of phi node in post-loop is the value of phi node at last iteration in pre-loop. Let's see a simple example.

for.body:
  %i = phi i16 [ 0, %entry ], [ %add, %cond.end ]
  %cmp1 = icmp ult i16 %i, 3
  br i1 %cmp1, label %cond.true, label %cond.false

cond.true:
  br label %cond.end

cond.false:
  br label %cond.end

cond.end:
  %call = call i16 @foo()
  %add = add nuw nsw i16 %i, 1
  %cmp2 = icmp ult i16 %i, 11
  br i1 %cmp2, label %for.body, label %end

In above example, we can add a LCSSA Phi node in exit block of pre-loop and update the start value of the phi node with the LCSSA one as below.

entry.split.split:
  %i.lcssa = phi i16 [ %add, %cond.end ]
...
for.body.split.preheader:
  br label %for.body.split
...
for.body.split:
  %i.split = phi i16 [ %add.split, %cond.end.split ], [ %i.lcssa, %for.body.split.preheader ]

In this example, the value of phi node at last iteration is %add because the exiting condition uses it.

If the exiting block is not loop latch, we need to update the phi node differently. Let's see other example.

for.cond:                                         ; preds = %for.inc, %entry
  %i.0 = phi i16 [ 0, %entry ], [ %inc.0, %for.inc ]
  %exitcond.not = icmp eq i16 %i.0, 10
  br i1 %exitcond.not, label %for.end, label %for.body

for.body:
  %cmp1 = icmp ult i16 %i.0, 5
  %arrayidx = getelementptr inbounds [10 x i16], [10 x i16]* @B, i16 0, i16 %i.0
  %0 = load i16, i16* %arrayidx, align 1
  br i1 %cmp1, label %if.then, label %if.else

if.then:
  br label %for.inc

if.else:
  br label %for.inc

for.inc:
  %inc.0 = add nuw nsw i16 %i.0, 1

For above example, we can add the LCSSA phi node as below.

entry.split.split:
  %i.0.lcssa = phi i16 [ %i.0, %for.cond ]
...
for.cond.split.preheader:
  br label %for.cond.split
...
for.cond.split:
  %i.0.split = phi i16 [ %inc.0.split, %for.inc.split ], [ %i.0.lcssa, %for.cond.split.preheader ]

In this example, the value of phi node at last iteration is %i.0 because the exiting condition uses it.

Based on this logic, the patch is updated. @mkazantsev If you feel something wrong, please let me know.

bjope resigned from this revision.Sep 27 2021, 1:43 PM
bjope added a subscriber: bjope.

Sorry for ping.

uabelho resigned from this revision.Sep 30 2021, 1:08 AM
uabelho added a subscriber: uabelho.
mkazantsev added inline comments.Oct 3 2021, 8:12 PM
llvm/lib/Transforms/Scalar/LoopBoundSplit.cpp
48

Please split out SCEV -> SCEVAddRecExpr refactoring in a separate NFC patch. It can go before the others.

66

Cast not needed.

66

Ignore this, I misread the code.

69

I guess this is impossible after you've ensured that AddRecSCEV is an addrec actually. Or can it be null?

69

Ignore this too.

391

What happens if there is more than one phi that will fit?

llvm/test/Transforms/LoopBoundSplit/bug51866.ll
40

Got it, I misread this IR and didn't notice the entry block was %inc. Thanks for clarification.

mkazantsev accepted this revision.Oct 3 2021, 8:13 PM

Look fine as far as I can tell, but please split out the SCEV->AddRec refactoring into a separate NFC patch to reduce the scope of change here.

This revision is now accepted and ready to land.Oct 3 2021, 8:13 PM
jaykang10 added inline comments.Oct 4 2021, 2:15 AM
llvm/lib/Transforms/Scalar/LoopBoundSplit.cpp
48

Yep, let me split this patch into two patches.

jaykang10 added inline comments.Oct 4 2021, 3:07 AM
llvm/lib/Transforms/Scalar/LoopBoundSplit.cpp
391

um... you are right. It will pick up just last phi node which could be wrong... Let me update the code.

jaykang10 updated this revision to Diff 376871.Oct 4 2021, 5:19 AM

Following comments of @mkazantsev, updated patch

  • Split the patch into two
  • Find PHI with exiting condition from pre-loop correctly
jaykang10 added inline comments.Oct 4 2021, 5:27 AM
llvm/lib/Transforms/Scalar/LoopBoundSplit.cpp
391

I have added code to check whether the phi node is SCEVAddRecExpr or not. After that, we can see below expected output.

; CHECK-NEXT:    [[EXITCOND_NOT_SPLIT:%.*]] = icmp eq i16 [[I_0_SPLIT]], 10

Previously, it was I_1_SPLIT intead of I_0_SPLIT.

The PHIs can have same incoming value from backedge but only one can be SCEVAddRecExpr as below.

for.cond:                                         ; preds = %for.inc, %entry
  %i.0 = phi i16 [ 0, %entry ], [ %inc.0, %for.inc ]
  %i.1 = phi i16 [ 10, %entry ], [ %inc.0, %for.inc ]
...
for.inc:                                          ; preds = %if.else, %if.then
  %inc.0 = add nuw nsw i16 %i.0, 1
...

@mkazantsev if you feel something wrong, please let me know. If it is correct, let me push this patch.