Page MenuHomePhabricator

[LoopInterchange] Disallow interchange when memory accesses are guarded by control flow (PR48057)
Needs ReviewPublic

Authored by TaWeiTu on Mar 2 2021, 8:29 PM.

Details

Reviewers
fhahn
Whitney
Summary

If instructions that have side effects are guarded by control flow,
the order of execution might differ after interchanging the loops
and results in incorrect behaviour (the branch condition may depends
on the execution order).

This fixes https://bugs.llvm.org/show_bug.cgi?id=48057

Diff Detail

Event Timeline

TaWeiTu created this revision.Mar 2 2021, 8:29 PM
TaWeiTu requested review of this revision.Mar 2 2021, 8:29 PM
Herald added a project: Restricted Project. · View Herald TranscriptMar 2 2021, 8:29 PM
TaWeiTu updated this revision to Diff 327658.Mar 2 2021, 8:32 PM
This comment was removed by TaWeiTu.
TaWeiTu updated this revision to Diff 327662.Mar 2 2021, 8:42 PM
This comment was removed by TaWeiTu.
TaWeiTu retitled this revision from [LoopInterchange] Disallow interchange when memory accesses are guarded by control flow to [LoopInterchange] Disallow interchange when memory accesses are guarded by control flow (PR48057).Mar 2 2021, 9:06 PM
Harbormaster completed remote builds in B91723: Diff 327658.
TaWeiTu updated this revision to Diff 327836.Mar 3 2021, 9:52 AM
  • Remove unnecessary lines in the test case
This comment was removed by Whitney.
fhahn added a comment.Mar 5 2021, 9:23 AM

Is this fixing a similar issue to D91228 /D87879?

Is this fixing a similar issue to D91228 /D87879?

Yes, the intention is the same.
I didn't know about these patches before, and I just test the reproducer PR47523 (the one D87879 is trying to address) against my patch, and the issue is indeed resolved (D91228 is addressing the exact same PR as this one does).

The following test in lcssa.ll is failing:

define void @lcssa_05(i32* %ptr) {
entry:
  br label %outer.header

outer.header:                                     ; preds = %outer.inc, %entry
  %iv.outer = phi i64 [ 1, %entry ], [ %iv.outer.next, %outer.inc ]
  br label %for.body3

for.body3:                                        ; preds = %bb3, %outer.header
  %iv.inner = phi i64 [ %iv.inner.next, %bb3 ], [ 1, %outer.header ]
  br i1 undef, label %bb2, label %bb3

bb2:                                              ; preds = %for.body3
  %arrayidx5 = getelementptr inbounds [100 x [100 x i32]], [100 x [100 x i32]]* @A, i64 0, i64 %iv.inner, i64 %iv.outer
  %vA = load i32, i32* %arrayidx5
  %arrayidx9 = getelementptr inbounds [100 x [100 x i32]], [100 x [100 x i32]]* @C, i64 0, i64 %iv.inner, i64 %iv.outer
  %vC = load i32, i32* %arrayidx9
  %add = add nsw i32 %vA, %vC
  br label %bb3

bb3:                                              ; preds = %bb2, %for.body3
  %addp = phi i32 [ %add, %bb2 ], [ 0, %for.body3 ]
  store i32 %addp, i32* %ptr
  %iv.inner.next = add nuw nsw i64 %iv.inner, 1
  %exitcond = icmp eq i64 %iv.inner.next, 100
  br i1 %exitcond, label %outer.inc, label %for.body3

outer.inc:                                        ; preds = %bb3
  %iv.inner.lcssa = phi i64 [ %iv.inner, %bb3 ]
  %iv.outer.next = add nsw i64 %iv.outer, 1
  %cmp = icmp eq i64 %iv.outer.next, 100
  br i1 %cmp, label %outer.header, label %for.exit

for.exit:                                         ; preds = %outer.inc
  %iv.inner.lcssa.lcssa = phi i64 [ %iv.inner.lcssa, %outer.inc ]
  store i64 %iv.inner.lcssa.lcssa, i64* @Y
  br label %for.end16

for.end16:                                        ; preds = %for.exit
  ret void
}

The test is expecting the loops to be interchanged, but judging from br i1 undef, label %bb2, label %bb3 I think the interchange should probabily be considered invalid in this case?

TaWeiTu updated this revision to Diff 328603.Mar 5 2021, 11:44 AM
  • Add testcase of PR47523
Whitney added inline comments.Mar 7 2021, 7:19 AM
llvm/lib/Transforms/Scalar/LoopInterchange.cpp
1021
TaWeiTu updated this revision to Diff 328863.Mar 7 2021, 7:32 AM
  • Use LoopInfo::getLoopFor
Whitney added inline comments.Mar 7 2021, 7:44 AM
llvm/lib/Transforms/Scalar/LoopInterchange.cpp
1018

In what scenario, outer loop contains instructions guarded by control flow?
(I assume loop interchange operate on tight loops.)

TaWeiTu updated this revision to Diff 328873.Mar 7 2021, 8:41 AM
  • Only check basic blocks in the inner loop
TaWeiTu marked 2 inline comments as done.Mar 7 2021, 8:41 AM
TaWeiTu added inline comments.
llvm/lib/Transforms/Scalar/LoopInterchange.cpp
1018

Thanks for pointing that out! I think you are right.

Have you considered the case where the blocks guarded by control flow doesn't have memory operation, but affect memory operation through phi?

bb1:
  br %cond, bb2, bb3
bb2:
  br bb4
bb3:
  br bb4
bb4:
  %t = phi [ %t2, %bb2 ], [ %t3, %bb3 ]
  store %t, %p
TaWeiTu updated this revision to Diff 332204.Mar 21 2021, 11:39 PM

Consider instructions that are indirectly affected by the control flow

Have you considered the case where the blocks guarded by control flow doesn't have memory operation, but affect memory operation through phi?

bb1:
  br %cond, bb2, bb3
bb2:
  br bb4
bb3:
  br bb4
bb4:
  %t = phi [ %t2, %bb2 ], [ %t3, %bb3 ]
  store %t, %p

I've updated the patch to perform a more comprehensive check that consider instructions that are indirectly guarded by the control flow.

Gentle ping

@TaWeiTu How do you want to handle the lcssa.ll failure? Maybe we can only disallow memory accesses that are guarded by loop variant condition?

TaWeiTu updated this revision to Diff 335988.Apr 7 2021, 8:46 PM
  • Skip loop-invariant-guarded blocks

@TaWeiTu How do you want to handle the lcssa.ll failure? Maybe we can only disallow memory accesses that are guarded by loop variant condition?

Now only loop-variant-guarded blocks are considered "bad". lcssa.ll should be fixed.
We should probably also add logic to check whether the orders of memory accesses actually change after interchanging the loop, but I think it's beyond the scope of this patch and can be improved in the future.

Whitney added inline comments.Apr 8 2021, 9:08 AM
llvm/lib/Transforms/Scalar/LoopInterchange.cpp
958
  br cond, BB1, BB2
BB1:
  br BB3
BB2
  br BB3

BB3 should not guarded by a loop variant block, but this function would return BB1, BB2, and BB3.