This patch rectifies icmp instruction combine problem seen because of previous optimization patch for 'ashr/lshr exact' as mentioned in bug 19958.
I admit i wrote the previous patch keeping only equality in mind (not considering all cases is bad practice, will be careful in future).
This patch for now handles 'icmp eq/ne' only for 'ashr/lsher exact'. Added relevant test cases.
For other icmp instructions, some observations:
define i1 @f(i32 %x) {
%y = lshr exact i32 8, %x %cmp = icmp ult i32 %y, 2 ret i1 %cmp
}
Here, in above example, as %x increases, %y decreases and hence comparison with same icmp instruction was giving wrong output.
while in ,
define i1 @f(i32 %x) {
%y = lshr exact i32 -8, %x %cmp = icmp ult i32 %y, -2 ret i1 %cmp
}
as %x increases, %y also increases mathematically. Hence comparison with same icmp instruction is fine in this case.
This was not captured in original patch and hence wrong output. This can be handled in following two ways:
- if const1,const2 are both positive, then icmp instruction should be swapped (something like getSwappedPredicate()). if const1,const2 are both negative, then keep the icmp instruction as it is. Note : getInversePredicate() won't work.
- if const1/const2 are both positive, swap the operands, keeping the icmp instruction as it is. No change if const1,const2 are both negative.
Note : Cases where const1 and const2 have opposite sign - one is positive other is negative are always true/false.
This is handled by simplifyicmp call and it never reaches our patch. I checked it with both ashr/lshr for this case.
I will verify if above two approaches for other icmp instructions are universally true for both ashr/lshr and come up with another patch.
Added a TODO for the same.
Please review this patch which handles equality for ashr/lshr exact.
Any comments/suggestions are most welcomed.
Thanks.
- Suyog