This patch allow more conditional branches to be considered as loop guard, and so more loop nests can be considered perfect.
Details
Diff Detail
- Repository
- rG LLVM Github Monorepo
Event Timeline
llvm/test/Analysis/LoopNestAnalysis/imperfectnest.ll | ||
---|---|---|
428 | This test case should be considered as imperfect, as the user branch is not the inner loop guard. |
llvm/lib/Analysis/LoopInfo.cpp | ||
---|---|---|
405 | apart from checking that the control flow can be simplified, we also need to check to make sure that the chain of BBs don't contain unsafe instructions (or at least check that they are empty). | |
llvm/test/Analysis/LoopNestAnalysis/imperfectnest.ll | ||
428 | the user branch is behaving as a guard so we can allow it per the current definition of a loop guard. |
llvm/lib/Analysis/LoopInfo.cpp | ||
---|---|---|
401 | I thought about that too, but it seems to be questionable... if (cond) goto label; if (0 < N) { for (int i = 0; i < N; ++i) {...} label: } Should we consider if (0 < N) to be a loop guard? It actually guarded the loop, but it is not a single entry single exit region. If we decided to check for unique predecessor, it may make sense to do it in skipEmptyBlockUntil. |
llvm/lib/Analysis/LoopInfo.cpp | ||
---|---|---|
401 | I think we should keep the control-flow cases that are considered "guard-like" fairly simple (otherwise transforms will be faced with too many canonical forms to have to deal with). I'd say we do not consider if (0 < N) in the example above as a loop guard, given that label: is also being guarded by that condition. There may be legitimate use cases for the current semantics of skipEmptyBlockUntil, so I think we should create another version of that function (or pass a flag to it) to additionally check for existence of unique predecessors. Then we could use that version of the function here to decide whether the control flow structure should be considered a guard. @sidbav could you please do that? |
llvm/lib/Analysis/LoopInfo.cpp | ||
---|---|---|
401 | Yes, I intended on adding adding it, but I ran into LIT test issues so I did not put those changes in the patch.... Just took another look at it, and I realized I made a mistake in the initial implementation. It is working now. |
llvm/include/llvm/Analysis/LoopNestAnalysis.h | ||
---|---|---|
64–66 | Please update the description. | |
llvm/lib/Analysis/LoopInfo.cpp | ||
401 | @sidbav Can you please add a unit test in llvm/unittests/Analysis/LoopInfoTest.cpp for a test case like if (cond) goto label; if (0 < N) { for (int i = 0; i < N; ++i) {...} label: } where the branch if (0 < N) should be not a guard. |
The previous Solution to bug was more of a hack rather than a fix. This patch actually resolves the bug
I think this changes the semantics for the cases that we used to handle before. For example wouldn't this require the exit successors to be empty? If so I think lcssa phis can prevent us from detecting guards. We need to test for those cases as well.
int foo(char * restrict aa, int N) { int sum = 0; for (int i = 0; i < N; i++) { sum += aa[i]; } return sum; }
The corresponding IR would be:
define dso_local signext i32 @foo(i8* noalias %aa, i32 signext %N) #0 { entry: %cmp1 = icmp sgt i32 %N, 0 br i1 %cmp1, label %for.body.preheader, label %for.end for.body.preheader: ; preds = %entry %wide.trip.count = zext i32 %N to i64 br label %for.body for.body: ; preds = %for.body.preheader, %for.body %indvars.iv = phi i64 [ 0, %for.body.preheader ], [ %indvars.iv.next, %for.body ] %sum.02 = phi i32 [ %add, %for.body ], [ 0, %for.body.preheader ] %arrayidx = getelementptr inbounds i8, i8* %aa, i64 %indvars.iv %0 = load i8, i8* %arrayidx, align 1, !tbaa !2 %conv = zext i8 %0 to i32 %add = add nuw nsw i32 %sum.02, %conv %indvars.iv.next = add nuw nsw i64 %indvars.iv, 1 %exitcond = icmp ne i64 %indvars.iv.next, %wide.trip.count br i1 %exitcond, label %for.body, label %for.end.loopexit, !llvm.loop !5 for.end.loopexit: ; preds = %for.body %add.lcssa = phi i32 [ %add, %for.body ] br label %for.end for.end: ; preds = %for.end.loopexit, %entry %sum.0.lcssa = phi i32 [ 0, %entry ], [ %add.lcssa, %for.end.loopexit ] ret i32 %sum.0.lcssa }
As you can see for.end is not empty and so this loop's guard won't be detected!
As you can see for.end is not empty and so this loop's guard won't be detected!
From looking at skipEmptyBlockUntil, End is not require to be empty, so skipEmptyBlockUntil(for.end.loopexit, for.end) would return for.end.
llvm/lib/Analysis/LoopInfo.cpp | ||
---|---|---|
383–384 | These 3 lines can be removed. | |
llvm/lib/Analysis/LoopNestAnalysis.cpp | ||
223–224 | Should change this to const BasicBlock *PredBB = From; |
True, although you can have another bb in between that has code, and then we won't be able to detect the guard. I don't think we would want to limit our ability to detect a guard when there is code in the epilogue. There is also the question of what "code" is considered neutral. Perhaps we need a way to distinguish between guards that are "dedicated" and those that are not.
So if I understand correctly, we still allow code in the exit block and the other target of the guard branch like before, so this is purely an improvement over what we had before. Although, as I said before I think we should allow non-empty blocks as well, but that can come as future extensions.
llvm/include/llvm/Analysis/LoopNestAnalysis.h | ||
---|---|---|
65 | [nit] has an unique -> has a unique |
Addressed review comment, and fixed a bug in detecting perfect nest using skipEmptyBlockUntil.
[nit] has an unique -> has a unique