This is an archive of the discontinued LLVM Phabricator instance.

[ScalarEvolution] Improve high cost heuristic in SCEVExpander.
Needs ReviewPublic

Authored by bevinh on Dec 7 2017, 1:28 AM.

Details

Summary

SCEVExpander previously considered all native integer UDivs
by a power of two to be cheap, regardless of the LHS. This
heuristic is rather permissive; the expression on the left
could be incredibly expensive.

This patch has SCEVExpander verify the cost of the LHS as
well. One regression test changed because of this, as the
cost of expanding a SCEV went from low to high.

Diff Detail

Event Timeline

bevinh created this revision.Dec 7 2017, 1:28 AM
bevinh updated this revision to Diff 125911.Dec 7 2017, 2:04 AM

Removed the x86 triple from the tests.

bjope added a subscriber: bjope.Dec 7 2017, 2:14 AM
Ka-Ka added a subscriber: Ka-Ka.Dec 7 2017, 3:36 AM
sanjoy edited edge metadata.Dec 7 2017, 11:19 PM
sanjoy added a subscriber: mzolotukhin.

lgtm to me, but I want Michael to take a look at the test update, as mentioned inline.

test/Analysis/ScalarEvolution/expensive-expansion.ll
40

Is this necessary?

test/Transforms/IndVarSimplify/no-iv-rewrite.ll
228–230

@mzolotukhin - can you please review this bit (that the test update is fine)?

bevinh added inline comments.Dec 8 2017, 1:08 AM
test/Analysis/ScalarEvolution/expensive-expansion.ll
40

When I was constructing the reproducer, I couldn't get loop-unroll to unroll the loop if it wasn't present.

Looking closer now, it seems that removing the target triple and the target specific attributes is preventing the unroll from happening even if I undo the patch. I'll see if I can get it to work again.

bevinh updated this revision to Diff 126100.Dec 8 2017, 1:28 AM

Enabled runtime loop unroll and added i32 as a legal integer type to prevent SCEVExpander from considering the trip count expensive due to that.

bjope added inline comments.Dec 18 2017, 10:23 AM
test/Transforms/IndVarSimplify/no-iv-rewrite.ll
228–230

Here is some more input to the review (I promised Bevin that I would keep an eye on this during his vacation).
With this patch I get:

define i64 @cloneOr(i32 %limit, i64* %base) #0 {
entry:
  %halfLim = ashr i32 %limit, 2
  %0 = sext i32 %halfLim to i64
  br label %loop

loop:                                             ; preds = %loop, %entry
  %indvars.iv = phi i64 [ %indvars.iv.next, %loop ], [ 0, %entry ]
  %adr = getelementptr i64, i64* %base, i64 %indvars.iv
  %val = load i64, i64* %adr
  %1 = or i64 %indvars.iv, 1
  %indvars.iv.next = add nuw nsw i64 %indvars.iv, 2
  %cmp = icmp slt i64 %indvars.iv.next, %0
  br i1 %cmp, label %loop, label %exit

exit:                                             ; preds = %loop
  %val.lcssa = phi i64 [ %val, %loop ]
  %t3.lcssa = phi i64 [ %1, %loop ]
  %result = and i64 %val.lcssa, %t3.lcssa
  ret i64 %result
}

Without the patch:

define i64 @cloneOr(i32 %limit, i64* %base) #0 {
entry:
  %halfLim = ashr i32 %limit, 2
  %0 = sext i32 %halfLim to i64
  %1 = icmp sgt i32 %halfLim, 2
  %smax = select i1 %1, i32 %halfLim, i32 2
  %2 = add i32 %smax, -1
  %3 = lshr i32 %2, 1
  %4 = zext i32 %3 to i64
  %5 = shl i64 %4, 1
  br label %loop

loop:                                             ; preds = %loop, %entry
  %indvars.iv = phi i64 [ %indvars.iv.next, %loop ], [ 0, %entry ]
  %adr = getelementptr i64, i64* %base, i64 %indvars.iv
  %val = load i64, i64* %adr
  %indvars.iv.next = add nuw nsw i64 %indvars.iv, 2
  %cmp = icmp slt i64 %indvars.iv.next, %0
  br i1 %cmp, label %loop, label %exit

exit:                                             ; preds = %loop
  %val.lcssa = phi i64 [ %val, %loop ]
  %6 = add i64 %5, 1
  %result = and i64 %val.lcssa, %6
  ret i64 %result
}

It is not completely obvious to me which one is better (at least I find it hard to say that this test case shows that this patch is good).
Not having the or inside the loop is probably good in most cases (at least when iteration count is high). If iteration count is low, then the calculation of %result is simpler when we leave the or inside the loop.

sanjoy resigned from this revision.Jan 29 2022, 5:32 PM