This is an archive of the discontinued LLVM Phabricator instance.

[SCEV] Use fact that B >u 0 for A <u B in applyLoopGuards.
ClosedPublic

Authored by fhahn on May 26 2022, 2:51 PM.

Details

Summary

If LHS <u RHS holds, RHS should be guaranteed to be > 0. By using
using 'umax(RHS, 1) -1' instead of 'RHS - 1' the results in
applyLoopGuards can be improved in some cases.

Note that the TODO for the tests mentioned the max BTC being 11, but
unless I am missing something 10 should be correct.

https://alive2.llvm.org/ce/z/44nP7F

Diff Detail

Event Timeline

fhahn created this revision.May 26 2022, 2:51 PM
Herald added a project: Restricted Project. · View Herald TranscriptMay 26 2022, 2:51 PM
Herald added a subscriber: hiraditya. · View Herald Transcript
fhahn requested review of this revision.May 26 2022, 2:51 PM
Herald added a project: Restricted Project. · View Herald TranscriptMay 26 2022, 2:51 PM

Will it still work if c1 and c2 are checked in the loop? To me it looks like there is some piece of logic is missing in isImpliedCondition.

fhahn added a comment.May 27 2022, 4:55 AM

@mkazantsev wrote:
Will it still work if c1 and c2 are checked in the loop?

Do you mean RHS & LHS? I'm not sure if a check in the loop would help us restrict the trip count, are you thinking about any case in particular?

To me it looks like there is some piece of logic is missing in isImpliedCondition.

I don't think we use isImpliedCond after constructing the SCEV expression for the backedge taken count and we also do not use it (at least with context) during SCEV construction. The UMax here helps to determine that the RHS - 1 won't wrap around, which in terms allows simplifying the BTC in some cases.

mkazantsev added a comment.EditedMay 29 2022, 9:31 PM

I was thinking of smth like

define void @test_chceks_in_loop(i32* nocapture %a, i64 %i, i64 %N) {
entry:
  br label loop

loop:
  %iv = phi i64 [ %iv.next, %loop.guarded.2 ], [ 0, %entry]
  %c.1 = icmp ult i64 %N, 12
  br i1 %c.1, label %loop.guarded.1, label %exit

loop.guarded.1:
  %c.2 = icmp ult i64 %i, %N
  br i1 %c.2, label %loop.guarded.2, label %exit

loop.guarded.2:
  %idx = getelementptr inbounds i32, i32* %a, i64 %iv
  store i32 1, i32* %idx, align 4
  %iv.next = add nuw nsw i64 %iv, 1
  %exitcond = icmp eq i64 %iv, %i
  br i1 %exitcond, label %exit, label %loop

exit:
  ret void
}

Normally I'd expect this checks would be hoisted/unswitched out, but if not, maybe it also makes sense to make it work.

llvm/lib/Analysis/ScalarEvolution.cpp
15065

I think there should be (maybe not such useful, but still) a symmetrical signed case with SINT_MIN+1 instead of 1.

mkazantsev accepted this revision.May 29 2022, 9:37 PM
mkazantsev added inline comments.
llvm/test/Analysis/ScalarEvolution/max-backedge-taken-count-guard-info.ll
397–398

I wonder, how come that AddRec is now proven to make just 1 iteration less, but BE taken count is reduced by 2? It's not necessarily a bug, but looks strange.

This revision is now accepted and ready to land.May 29 2022, 9:37 PM

Generally LGTM, but please think of examples I've provided, maybe there is a better solution.

fhahn updated this revision to Diff 493929.Feb 1 2023, 6:12 AM
fhahn marked an inline comment as done.

Rebased and fixed crash on pointer inductions.

Generally LGTM, but please think of examples I've provided, maybe there is a better solution.

Hmm, I think we could adjust the point from which we collect the conditions if it is used to compute the BTC for a specific branch. I'd need to check the compile-time impact and if it helps in practive much, as those branches should be hoisted out of the loop, as you said.

fhahn marked an inline comment as done.Feb 1 2023, 6:16 AM
fhahn added inline comments.
llvm/lib/Analysis/ScalarEvolution.cpp
15065

I'll investigate as follow-up.

llvm/test/Analysis/ScalarEvolution/max-backedge-taken-count-guard-info.ll
397–398

The loop is guarded by %N < %I and %I < %N which cannot both be true, so it should never execute.

The reason we adjust the count by 2 is because we do the following substitutions

i = min(i, N-1)
N = min(N, min(I, N-1) -1)

nikic added inline comments.Feb 1 2023, 6:16 AM
llvm/test/Analysis/ScalarEvolution/max-backedge-taken-count-guard-info.ll
310–311

Remove TODO?

fhahn updated this revision to Diff 493939.Feb 1 2023, 7:19 AM
fhahn marked an inline comment as done.

Remove fixed TODOs, thanks! I plan to land this soon.

fhahn marked an inline comment as done.Feb 1 2023, 8:07 AM
This revision was landed with ongoing or failed builds.Feb 1 2023, 8:51 AM
This revision was automatically updated to reflect the committed changes.