Page MenuHomePhabricator

[SCEV] Use fact that B >u 0 for A <u B in applyLoopGuards.
AcceptedPublic

Authored by fhahn on May 26 2022, 2:51 PM.

Details

Summary

If LHS <u RHS holds, RHS should be guaranteed to be > 0. By using
using 'umax(RHS, 1) -1' instead of 'RHS - 1' the results in
applyLoopGuards can be improved in some cases.

Note that the TODO for the tests mentioned the max BTC being 11, but
unless I am missing something 10 should be correct.

https://alive2.llvm.org/ce/z/44nP7F

Diff Detail

Event Timeline

fhahn created this revision.May 26 2022, 2:51 PM
Herald added a project: Restricted Project. · View Herald TranscriptMay 26 2022, 2:51 PM
Herald added a subscriber: hiraditya. · View Herald Transcript
fhahn requested review of this revision.May 26 2022, 2:51 PM
Herald added a project: Restricted Project. · View Herald TranscriptMay 26 2022, 2:51 PM

Will it still work if c1 and c2 are checked in the loop? To me it looks like there is some piece of logic is missing in isImpliedCondition.

fhahn added a comment.May 27 2022, 4:55 AM

@mkazantsev wrote:
Will it still work if c1 and c2 are checked in the loop?

Do you mean RHS & LHS? I'm not sure if a check in the loop would help us restrict the trip count, are you thinking about any case in particular?

To me it looks like there is some piece of logic is missing in isImpliedCondition.

I don't think we use isImpliedCond after constructing the SCEV expression for the backedge taken count and we also do not use it (at least with context) during SCEV construction. The UMax here helps to determine that the RHS - 1 won't wrap around, which in terms allows simplifying the BTC in some cases.

mkazantsev added a comment.EditedMay 29 2022, 9:31 PM

I was thinking of smth like

define void @test_chceks_in_loop(i32* nocapture %a, i64 %i, i64 %N) {
entry:
  br label loop

loop:
  %iv = phi i64 [ %iv.next, %loop.guarded.2 ], [ 0, %entry]
  %c.1 = icmp ult i64 %N, 12
  br i1 %c.1, label %loop.guarded.1, label %exit

loop.guarded.1:
  %c.2 = icmp ult i64 %i, %N
  br i1 %c.2, label %loop.guarded.2, label %exit

loop.guarded.2:
  %idx = getelementptr inbounds i32, i32* %a, i64 %iv
  store i32 1, i32* %idx, align 4
  %iv.next = add nuw nsw i64 %iv, 1
  %exitcond = icmp eq i64 %iv, %i
  br i1 %exitcond, label %exit, label %loop

exit:
  ret void
}

Normally I'd expect this checks would be hoisted/unswitched out, but if not, maybe it also makes sense to make it work.

llvm/lib/Analysis/ScalarEvolution.cpp
14481

I think there should be (maybe not such useful, but still) a symmetrical signed case with SINT_MIN+1 instead of 1.

mkazantsev accepted this revision.May 29 2022, 9:37 PM
mkazantsev added inline comments.
llvm/test/Analysis/ScalarEvolution/max-backedge-taken-count-guard-info.ll
386

I wonder, how come that AddRec is now proven to make just 1 iteration less, but BE taken count is reduced by 2? It's not necessarily a bug, but looks strange.

This revision is now accepted and ready to land.May 29 2022, 9:37 PM

Generally LGTM, but please think of examples I've provided, maybe there is a better solution.