This is an archive of the discontinued LLVM Phabricator instance.

[LoopNest] Consider loop nest with inner loop guard using outer loop induction variable to be perfect
ClosedPublic

Authored by Whitney on Jan 14 2021, 1:37 PM.

Details

Summary

This patch allow more conditional branches to be considered as loop guard, and so more loop nests can be considered perfect.

Diff Detail

Event Timeline

sidbav created this revision.Jan 14 2021, 1:37 PM
sidbav requested review of this revision.Jan 14 2021, 1:37 PM
Herald added a project: Restricted Project. · View Herald TranscriptJan 14 2021, 1:37 PM
Whitney added inline comments.Jan 14 2021, 2:00 PM
llvm/test/Analysis/LoopNestAnalysis/imperfectnest.ll
428

This test case should be considered as imperfect, as the user branch is not the inner loop guard.

bmahjour requested changes to this revision.Apr 28 2021, 1:19 PM
bmahjour added inline comments.
llvm/lib/Analysis/LoopInfo.cpp
405

apart from checking that the control flow can be simplified, we also need to check to make sure that the chain of BBs don't contain unsafe instructions (or at least check that they are empty).

llvm/test/Analysis/LoopNestAnalysis/imperfectnest.ll
428

the user branch is behaving as a guard so we can allow it per the current definition of a loop guard.

This revision now requires changes to proceed.Apr 28 2021, 1:19 PM
sidbav updated this revision to Diff 341557.Apr 29 2021, 9:59 AM

Ensure the chain of BBs are empty blocks.

sidbav marked 3 inline comments as done.Apr 29 2021, 9:59 AM
Whitney accepted this revision.Apr 29 2021, 10:43 AM
bmahjour added inline comments.Apr 29 2021, 12:02 PM
llvm/lib/Analysis/LoopInfo.cpp
398

[nit] this comment needs to be reworded to be more clear.

401

AFAICS skipEmptyBlockUntil doesn't check that the unique successor blocks have unique predecessors. Don't we need to check for that too?

Whitney added inline comments.Apr 29 2021, 12:16 PM
llvm/lib/Analysis/LoopInfo.cpp
401

I thought about that too, but it seems to be questionable...

if (cond)
  goto label;
if (0 < N) {
  for (int i = 0; i < N; ++i) {...}
label:  
}

Should we consider if (0 < N) to be a loop guard? It actually guarded the loop, but it is not a single entry single exit region.

If we decided to check for unique predecessor, it may make sense to do it in skipEmptyBlockUntil.

bmahjour added inline comments.Apr 29 2021, 1:01 PM
llvm/lib/Analysis/LoopInfo.cpp
401

I think we should keep the control-flow cases that are considered "guard-like" fairly simple (otherwise transforms will be faced with too many canonical forms to have to deal with). I'd say we do not consider if (0 < N) in the example above as a loop guard, given that label: is also being guarded by that condition.

There may be legitimate use cases for the current semantics of skipEmptyBlockUntil, so I think we should create another version of that function (or pass a flag to it) to additionally check for existence of unique predecessors. Then we could use that version of the function here to decide whether the control flow structure should be considered a guard. @sidbav could you please do that?

sidbav updated this revision to Diff 341708.Apr 29 2021, 4:38 PM

Modify skipEmptyBlockUntil to also consider unique predecessors.

sidbav marked 4 inline comments as done.Apr 29 2021, 4:42 PM
sidbav added inline comments.
llvm/lib/Analysis/LoopInfo.cpp
401

Yes, I intended on adding adding it, but I ran into LIT test issues so I did not put those changes in the patch.... Just took another look at it, and I realized I made a mistake in the initial implementation. It is working now.

Whitney added inline comments.Apr 29 2021, 4:48 PM
llvm/include/llvm/Analysis/LoopNestAnalysis.h
64–66

Please update the description.

llvm/lib/Analysis/LoopInfo.cpp
401

@sidbav Can you please add a unit test in llvm/unittests/Analysis/LoopInfoTest.cpp for a test case like

if (cond)
  goto label;
if (0 < N) {
  for (int i = 0; i < N; ++i) {...}
label:  
}

where the branch if (0 < N) should be not a guard.

bmahjour added inline comments.Apr 30 2021, 8:14 AM
llvm/include/llvm/Analysis/LoopNestAnalysis.h
69

[nit] rename UniquePred -> CheckUniquePred.

llvm/lib/Analysis/LoopInfo.cpp
403

[nit] rename UniquePred -> CheckUniquePred.

llvm/lib/Analysis/LoopNestAnalysis.cpp
210

[nit] rename UniquePred -> CheckUniquePred.

sidbav updated this revision to Diff 342044.Apr 30 2021, 2:32 PM

address review comments

sidbav marked 5 inline comments as done.Apr 30 2021, 2:33 PM
Whitney accepted this revision.Apr 30 2021, 2:35 PM
bmahjour accepted this revision.Apr 30 2021, 2:39 PM
This revision is now accepted and ready to land.Apr 30 2021, 2:39 PM
sidbav updated this revision to Diff 342051.Apr 30 2021, 2:51 PM

The previous Solution to bug was more of a hack rather than a fix. This patch actually resolves the bug

Whitney accepted this revision.Apr 30 2021, 2:52 PM
bmahjour requested changes to this revision.Apr 30 2021, 3:10 PM

I think this changes the semantics for the cases that we used to handle before. For example wouldn't this require the exit successors to be empty? If so I think lcssa phis can prevent us from detecting guards. We need to test for those cases as well.

This revision now requires changes to proceed.Apr 30 2021, 3:10 PM

I think this changes the semantics for the cases that we used to handle before. For example wouldn't this require the exit successors to be empty? If so I think lcssa phis can prevent us from detecting guards. We need to test for those cases as well.

LCSSA phis should be in the loop exit block, not exit successors.

I think this changes the semantics for the cases that we used to handle before. For example wouldn't this require the exit successors to be empty? If so I think lcssa phis can prevent us from detecting guards. We need to test for those cases as well.

LCSSA phis should be in the loop exit block, not exit successors.

int foo(char * restrict aa, int N) {
  int sum = 0;
  for (int i = 0; i < N; i++) {
    sum += aa[i];
  }
  return sum;
}

The corresponding IR would be:

define dso_local signext i32 @foo(i8* noalias %aa, i32 signext %N) #0 {
entry:
  %cmp1 = icmp sgt i32 %N, 0
  br i1 %cmp1, label %for.body.preheader, label %for.end

for.body.preheader:                               ; preds = %entry
  %wide.trip.count = zext i32 %N to i64
  br label %for.body

for.body:                                         ; preds = %for.body.preheader, %for.body
  %indvars.iv = phi i64 [ 0, %for.body.preheader ], [ %indvars.iv.next, %for.body ]
  %sum.02 = phi i32 [ %add, %for.body ], [ 0, %for.body.preheader ]
  %arrayidx = getelementptr inbounds i8, i8* %aa, i64 %indvars.iv
  %0 = load i8, i8* %arrayidx, align 1, !tbaa !2
  %conv = zext i8 %0 to i32
  %add = add nuw nsw i32 %sum.02, %conv
  %indvars.iv.next = add nuw nsw i64 %indvars.iv, 1
  %exitcond = icmp ne i64 %indvars.iv.next, %wide.trip.count
  br i1 %exitcond, label %for.body, label %for.end.loopexit, !llvm.loop !5

for.end.loopexit:                                 ; preds = %for.body
  %add.lcssa = phi i32 [ %add, %for.body ]
  br label %for.end

for.end:                                          ; preds = %for.end.loopexit, %entry
  %sum.0.lcssa = phi i32 [ 0, %entry ], [ %add.lcssa, %for.end.loopexit ]
  ret i32 %sum.0.lcssa
}

As you can see for.end is not empty and so this loop's guard won't be detected!

As you can see for.end is not empty and so this loop's guard won't be detected!

From looking at skipEmptyBlockUntil, End is not require to be empty, so skipEmptyBlockUntil(for.end.loopexit, for.end) would return for.end.

llvm/lib/Analysis/LoopInfo.cpp
383–384

These 3 lines can be removed.

llvm/lib/Analysis/LoopNestAnalysis.cpp
223–224

Should change this to const BasicBlock *PredBB = From;

As you can see for.end is not empty and so this loop's guard won't be detected!

From looking at skipEmptyBlockUntil, End is not require to be empty, so skipEmptyBlockUntil(for.end.loopexit, for.end) would return for.end.

True, although you can have another bb in between that has code, and then we won't be able to detect the guard. I don't think we would want to limit our ability to detect a guard when there is code in the epilogue. There is also the question of what "code" is considered neutral. Perhaps we need a way to distinguish between guards that are "dedicated" and those that are not.

Whitney updated this revision to Diff 343184.May 5 2021, 1:57 PM
Whitney edited the summary of this revision. (Show Details)

Addressed review comments.

bmahjour accepted this revision.May 5 2021, 2:14 PM

So if I understand correctly, we still allow code in the exit block and the other target of the guard branch like before, so this is purely an improvement over what we had before. Although, as I said before I think we should allow non-empty blocks as well, but that can come as future extensions.

llvm/include/llvm/Analysis/LoopNestAnalysis.h
65

[nit] has an unique -> has a unique

This revision is now accepted and ready to land.May 5 2021, 2:14 PM

So if I understand correctly, we still allow code in the exit block and the other target of the guard branch like before, so this is purely an improvement over what we had before. Although, as I said before I think we should allow non-empty blocks as well, but that can come as future extensions.

Yes, our understanding are the same.

Whitney updated this revision to Diff 343275.May 5 2021, 7:44 PM

Addressed review comment, and fixed a bug in detecting perfect nest using skipEmptyBlockUntil.

sidbav accepted this revision.May 6 2021, 5:42 PM

LGTM