This is an archive of the discontinued LLVM Phabricator instance.

[LoopReroll] Fix rerolling loop with extra instructions
ClosedPublic

Authored by kawashima-fj on Sep 28 2020, 12:25 AM.

Details

Summary

Fixes PR47627

This fix suppresses rerolling a loop which has an unrerollable
instruction.

Sample IR for the explanation below:

define void @foo([2 x i32]* nocapture %a) {
entry:
  br label %loop

loop:
  ; base instruction
  %indvar = phi i64 [ 0, %entry ], [ %indvar.next, %loop ]

  ; unrerollable instructions
  %stptrx = getelementptr inbounds [2 x i32], [2 x i32]* %a, i64 %indvar, i64 0
  store i32 999, i32* %stptrx, align 4

  ; extra simple arithmetic operations, used by root instructions
  %plus20 = add nuw nsw i64 %indvar, 20
  %plus10 = add nuw nsw i64 %indvar, 10

  ; root instruction 0
  %ldptr0 = getelementptr inbounds [2 x i32], [2 x i32]* %a, i64 %plus20, i64 0
  %value0 = load i32, i32* %ldptr0, align 4
  %stptr0 = getelementptr inbounds [2 x i32], [2 x i32]* %a, i64 %plus10, i64 0
  store i32 %value0, i32* %stptr0, align 4

  ; root instruction 1
  %ldptr1 = getelementptr inbounds [2 x i32], [2 x i32]* %a, i64 %plus20, i64 1
  %value1 = load i32, i32* %ldptr1, align 4
  %stptr1 = getelementptr inbounds [2 x i32], [2 x i32]* %a, i64 %plus10, i64 1
  store i32 %value1, i32* %stptr1, align 4

  ; loop-increment and latch
  %indvar.next = add nuw nsw i64 %indvar, 1
  %exitcond = icmp eq i64 %indvar.next, 5
  br i1 %exitcond, label %exit, label %loop

exit:
  ret void
}

In the loop rerolling pass, %indvar and %indvar.next are appended
to the LoopIncs vector in the LoopReroll::DAGRootTracker::findRoots
function.

Before this fix, two instructions with unrerollable instructions
comment above are marked as IL_All at the end of the
LoopReroll::DAGRootTracker::collectUsedInstructions function,
as well as instructions with extra simple arithmetic operations
comment and loop-increment and latch comment. It is incorrect
because IL_All means that the instruction should be executed in all
iterations of the rerolled loop but the store instruction should
not.

This fix rejects instructions which may have side effects and don't
belong to def-use chains of any root instructions and reductions.

See https://bugs.llvm.org/show_bug.cgi?id=47627 for more information.

Diff Detail

Event Timeline

kawashima-fj created this revision.Sep 28 2020, 12:25 AM
kawashima-fj requested review of this revision.Sep 28 2020, 12:25 AM

clang-format reported a format error DenseSet<Instruction*> -> DenseSet<Instruction *> (space before *). However, other three places in the same function (and many places in other functions) have the same format. Should I change all places? only my change?, or keep all them unchanged?

Ping.
I know many of LLVM developers are busy with the developer's meeting now. When someone has time, please review.

Ping.
Can anyone review this fix?

Thanks for all the effort you put into this and sorry you had to wait so long for somebody to look at this.

I don't think the problem here is the LoopIncs. The unrerollable instructions can as well depend on %iv.next. You patch would still reroll the following:

define void @unrerollable_simple([2 x i32]* nocapture %a) {
entry:
  br label %loop

loop:
  ; base instruction
  %iv = phi i64 [ 0, %entry ], [ %iv.next, %loop ]

  ; unrerollable instructions
  %iv.next = add nuw nsw i64 %iv, 1
  %stptrx = getelementptr inbounds [2 x i32], [2 x i32]* %a, i64 %iv.next, i64 0
  store i32 999, i32* %stptrx, align 4

  ; extra simple arithmetic operations, used by root instructions
  %plus20 = add nuw nsw i64 %iv, 20
  %plus10 = add nuw nsw i64 %iv, 10

  ; root instruction 0
  %ldptr0 = getelementptr inbounds [2 x i32], [2 x i32]* %a, i64 %plus20, i64 0
  %value0 = load i32, i32* %ldptr0, align 4
  %stptr0 = getelementptr inbounds [2 x i32], [2 x i32]* %a, i64 %plus10, i64 0
  store i32 %value0, i32* %stptr0, align 4

  ; root instruction 1
  %ldptr1 = getelementptr inbounds [2 x i32], [2 x i32]* %a, i64 %plus20, i64 1
  %value1 = load i32, i32* %ldptr1, align 4
  %stptr1 = getelementptr inbounds [2 x i32], [2 x i32]* %a, i64 %plus10, i64 1
  store i32 %value1, i32* %stptr1, align 4

  ; loop-increment and latch
  ;%iv.next = add nuw nsw i64 %iv, 1
  %exitcond = icmp eq i64 %iv.next, 5
  br i1 %exitcond, label %exit, label %loop

exit:
  ret void
}

To me the problem looks like that the unrerollable store does not belong to any root set and is skipped during verification.

clang-format reported a format error DenseSet<Instruction*> -> DenseSet<Instruction *> (space before *). However, other three places in the same function (and many places in other functions) have the same format. Should I change all places? only my change?, or keep all them unchanged?

Use the clang-format suggestion. To goal is to eventually converge to a completely clang-formatted codebase.

llvm/lib/Transforms/Scalar/LoopRerollPass.cpp
1076

[style] LLVM's coding style does not make use of Almost-Always-Auto.

llvm/test/Transforms/LoopReroll/extra_instr.ll
13

[nit] There's usually a space before CHECK

@Meinersbur Thanks for your review. I'll update the patch.

kawashima-fj edited the summary of this revision. (Show Details)

@Meinersbur I'm sorry for being late. I updated the patch. Please review again.

Looks much simpler now ;-)

I will accept after fixing the LLVM_DEBUG message.

It has a lot fewer tests now, any reason to remove some tests (to clirify: I tihnk the current 3 tests are sufficient)

After that, would you like me to push the patch to the repository's main branch? Potentially also include it in release 12.0.1?

llvm/lib/Transforms/Scalar/LoopRerollPass.cpp
1085

[typo] An instruction which does not belongs to any root must not have side effects

kawashima-fj marked an inline comment as done.

@Meinersbur Thanks for your review. I've updated the patch. Correction of LLVM_DEBUG message and adding test patterns.

I can push it to the GitHub main branch by myself. I don't know the criteria of pushing patch to the release branch. Should it be pushed?

kawashima-fj marked an inline comment as done.Apr 22 2021, 10:58 PM
Meinersbur accepted this revision.Apr 22 2021, 11:05 PM

I can push it to the GitHub main branch by myself. I don't know the criteria of pushing patch to the release branch. Should it be pushed?

You don't push yourself.

The protocol is that you close the bug report and fill the "fixed by commit" field, then add the bug as a release blocker of the 12.0.1 meta-bug.

This revision is now accepted and ready to land.Apr 22 2021, 11:05 PM
kawashima-fj closed this revision.Apr 23 2021, 12:39 AM

Pushed to the main branch. I'm sorry, I forgot to add Differential Revision: line. The commit is d9a9c992d190dd6645ea911b66cf0cadba0dadc3 .