With this code, we only evaluate nodes that cannot be resolved as SCEVAddRec.
So it now emits the correct loop exit values and LoopDispositions for the following IR:
  define void @f() {
  entry:
    br label %loop

  loop:
    %iv = phi i32 [ 0, %entry ], [ %iv.inc, %loop ]
    %iv.inc = add i32 %iv, 1
    %cmp = icmp ne i32 %iv.inc, 10
    %cmp.zext = zext i1 %cmp to i32
    br i1 %cmp, label %loop, label %leave

  leave:
    ret void
  }
  Printing analysis 'Scalar Evolution Analysis' for function 'f':
  Classifying expressions for: @f
    %iv = phi i32 [ 0, %entry ], [ %iv.inc, %loop ]
    -->  {0,+,1}<%loop> U: [0,10) S: [0,10)  Exits: 9  LoopDispositions: { %loop: Computable }
    %iv.inc = add i32 %iv, 1
    -->  {1,+,1}<%loop> U: [1,11) S: [1,11)  Exits: 10  LoopDispositions: { %loop: Computable }
    %cmp.zext = zext i1 %cmp to i32
    -->  (zext i1 %cmp to i32) U: [0,2) S: [0,2)  Exits: 0  LoopDispositions: { %loop: Variant }
  Determining loop execution counts for: @f
  Loop %loop: backedge-taken count is 9
  Loop %loop: max backedge-taken count is 9
  Loop %loop: Predicated backedge-taken count is 9
   Predicates:
  Loop %loop: Trip multiple is 10
Hi,
As per the comments on the review (https://reviews.llvm.org/D38494):
Even if we fine-tune the condition for calling evaluateForICmp, we would still be polluting the scalar evolution of the zext (please refer to D38494 for the test example). Also, the LoopDisposition for the zext would now show Invariant, whereas it is in fact Variant, because in the last iteration the value fed to the zext goes low, i.e. false.
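To make the Variant point concrete, here is a small source-level simulation of the loop in @f, written in C++ purely for illustration. The helper name simulateCmpZext is hypothetical and not part of the patch. It records the value of %cmp.zext on every iteration:

```cpp
#include <cstdint>
#include <vector>

// Hypothetical helper: mirrors the loop in @f and records the value of
// %cmp.zext on each iteration.
std::vector<uint32_t> simulateCmpZext() {
  std::vector<uint32_t> values;
  uint32_t iv = 0;                    // %iv = phi [ 0, %entry ], [ %iv.inc, %loop ]
  bool cmp = false;
  do {
    uint32_t ivInc = iv + 1;          // %iv.inc = add i32 %iv, 1
    cmp = (ivInc != 10);              // %cmp = icmp ne i32 %iv.inc, 10
    values.push_back(cmp ? 1u : 0u);  // %cmp.zext = zext i1 %cmp to i32
    iv = ivInc;
  } while (cmp);                      // br i1 %cmp, label %loop, label %leave
  return values;
}
```

The recorded value is 1 for the nine iterations that take the backedge and 0 on the final one, so the zext changes inside the loop and cannot be treated as invariant.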
One foolproof method [not sure if it is optimal, do suggest better options] could be to reduce the trip count of the loop by one, i.e. make its range run from start to end-1, and copy the contents of the loop outside the loop for the last iteration. This would maintain the sanity of the scalar evolution analysis, since evaluateForICmp can now return TRUE when the compare condition matches the latch condition, and any client using this info would compute correct LoopDispositions.
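The peeling idea can be sketched at the source level as follows. This is only an illustration, assuming the trip count is the known constant 10 from the example; runOriginal and runPeeled are hypothetical names, and a real transformation would of course operate on the IR rather than on C++.

```cpp
#include <cstdint>
#include <vector>

// Original loop: the compare is true for nine iterations and false on the last.
std::vector<uint32_t> runOriginal() {
  std::vector<uint32_t> zexts;
  for (uint32_t iv = 0;; ++iv) {
    bool cmp = (iv + 1 != 10);        // same condition as the latch
    zexts.push_back(cmp ? 1u : 0u);   // value of the zext in this iteration
    if (!cmp)
      break;
  }
  return zexts;
}

// Peeled form: the loop covers start to end-1, and the last iteration is
// copied after the loop.
std::vector<uint32_t> runPeeled() {
  std::vector<uint32_t> zexts;
  for (uint32_t iv = 0; iv + 1 != 10; ++iv)
    zexts.push_back(1u);              // inside the loop the compare is always true
  zexts.push_back(0u);                // peeled last iteration: compare is false
  return zexts;
}
```

Both functions produce the same sequence of values, but in the peeled form the compare is constant within the loop body, which is the property that would let evaluateForICmp fold it safely.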