Page MenuHomePhabricator

[LoopReroll] Fix rerolling loop with extra instructions
Needs ReviewPublic

Authored by kawashima-fj on Sep 28 2020, 12:25 AM.

Details

Summary

Fixes PR47627

This fix correctly suppresses rerolling a loop which has unrerollable
instructions which use the induction variable but are not used by root
instructions, reduction instructions, or a loop-increment instruction.

Sample IR for the explanation below:

define void @foo([2 x i32]* nocapture %a) {
entry:
  br label %loop

loop:
  ; base instruction
  %indvar = phi i64 [ 0, %entry ], [ %indvar.next, %loop ]

  ; unrerollable instructions
  %stptrx = getelementptr inbounds [2 x i32], [2 x i32]* %a, i64 %indvar, i64 0
  store i32 999, i32* %stptrx, align 4

  ; extra simple arithmetic operations, used by root instructions
  %plus20 = add nuw nsw i64 %indvar, 20
  %plus10 = add nuw nsw i64 %indvar, 10

  ; root instruction 0
  %ldptr0 = getelementptr inbounds [2 x i32], [2 x i32]* %a, i64 %plus20, i64 0
  %value0 = load i32, i32* %ldptr0, align 4
  %stptr0 = getelementptr inbounds [2 x i32], [2 x i32]* %a, i64 %plus10, i64 0
  store i32 %value0, i32* %stptr0, align 4

  ; root instruction 1
  %ldptr1 = getelementptr inbounds [2 x i32], [2 x i32]* %a, i64 %plus20, i64 1
  %value1 = load i32, i32* %ldptr1, align 4
  %stptr1 = getelementptr inbounds [2 x i32], [2 x i32]* %a, i64 %plus10, i64 1
  store i32 %value1, i32* %stptr1, align 4

  ; loop-increment and latch
  %indvar.next = add nuw nsw i64 %indvar, 1
  %exitcond = icmp eq i64 %indvar.next, 5
  br i1 %exitcond, label %exit, label %loop

exit:
  ret void
}

Before this fix, %indvar was appended to the LoopIncs vector in
the LoopReroll::DAGRootTracker::findRoots function. I think this is
because we want to mark two instructions with
extra simple arithmetic operations comment above as rerollable
(IL_All) at the end of the
LoopReroll::DAGRootTracker::collectUsedInstructions function,
as well as three instructions with the loop-increment and latch
comment. However, this caused two instruction with the
unrerollable instructions comment above to be marked as rerollable
too.

After this fix, %indvar is not appended to the LoopIncs vector.
Instead, two instructions with extra simple arithmetic operations
comment are marked as rerollable using their own collectInLoopUserSet
call. The unrerollable store instruction is rejected by
isSimpleArithmeticOp. isSimpleArithmeticOp is used in the
LoopReroll::DAGRootTracker::findRootsRecursive function to accept
inctructions with the extra simple arithmetic operations comment.
In this logic, %stptrx is also marked as rerolloable but is harmless.

See https://bugs.llvm.org/show_bug.cgi?id=47627 for more information.

Diff Detail

Event Timeline

kawashima-fj created this revision.Sep 28 2020, 12:25 AM
kawashima-fj requested review of this revision.Sep 28 2020, 12:25 AM

clang-format reported a format error DenseSet<Instruction*> -> DenseSet<Instruction *> (space before *). However, other three places in the same function (and many places in other functions) have the same format. Should I change all places? only my change?, or keep all them unchanged?

Ping.
I know many of LLVM developers are busy with the developer's meeting now. When someone has time, please review.

Ping.
Can anyone review this fix?

Thanks for all the effort you put into this and sorry you had to wait so long for somebody to look at this.

I don't think the problem here is the LoopIncs. The unrerollable instructions can as well depend on %iv.next. You patch would still reroll the following:

define void @unrerollable_simple([2 x i32]* nocapture %a) {
entry:
  br label %loop

loop:
  ; base instruction
  %iv = phi i64 [ 0, %entry ], [ %iv.next, %loop ]

  ; unrerollable instructions
  %iv.next = add nuw nsw i64 %iv, 1
  %stptrx = getelementptr inbounds [2 x i32], [2 x i32]* %a, i64 %iv.next, i64 0
  store i32 999, i32* %stptrx, align 4

  ; extra simple arithmetic operations, used by root instructions
  %plus20 = add nuw nsw i64 %iv, 20
  %plus10 = add nuw nsw i64 %iv, 10

  ; root instruction 0
  %ldptr0 = getelementptr inbounds [2 x i32], [2 x i32]* %a, i64 %plus20, i64 0
  %value0 = load i32, i32* %ldptr0, align 4
  %stptr0 = getelementptr inbounds [2 x i32], [2 x i32]* %a, i64 %plus10, i64 0
  store i32 %value0, i32* %stptr0, align 4

  ; root instruction 1
  %ldptr1 = getelementptr inbounds [2 x i32], [2 x i32]* %a, i64 %plus20, i64 1
  %value1 = load i32, i32* %ldptr1, align 4
  %stptr1 = getelementptr inbounds [2 x i32], [2 x i32]* %a, i64 %plus10, i64 1
  store i32 %value1, i32* %stptr1, align 4

  ; loop-increment and latch
  ;%iv.next = add nuw nsw i64 %iv, 1
  %exitcond = icmp eq i64 %iv.next, 5
  br i1 %exitcond, label %exit, label %loop

exit:
  ret void
}

To me the problem looks like that the unrerollable store does not belong to any root set and is skipped during verification.

clang-format reported a format error DenseSet<Instruction*> -> DenseSet<Instruction *> (space before *). However, other three places in the same function (and many places in other functions) have the same format. Should I change all places? only my change?, or keep all them unchanged?

Use the clang-format suggestion. To goal is to eventually converge to a completely clang-formatted codebase.

llvm/lib/Transforms/Scalar/LoopRerollPass.cpp
1076

[style] LLVM's coding style does not make use of Almost-Always-Auto.

llvm/test/Transforms/LoopReroll/extra_instr.ll
12

[nit] There's usually a space before CHECK

@Meinersbur Thanks for your review. I'll update the patch.