All this optimization cares about is knowing how many low bits of LHS is known to be zero and whether that means that the result is 0 or greater than the RHS constant. It doesn't matter where the zeros in the low bits came from. So we don't need to specifically look for an AND. Instead we can use known bits.
Details
Diff Detail
Event Timeline
LGTM.
Given that we're already calling computeKnownBits here, this seems ok. Is there a real motivating benchmark though? If so, I think you want to add a reduction of that, so we're less likely to lose it if/when there's a rewrite of instcombine.
Also, I commented on D38206 before I saw this patch, so this patch may make that comment moot. Although similar to this case, another test to check a less restricted version would be good if you've seen it in the real world.
This didn't come from any real benchmark. The one use restriction on foldICmpAndShift was something I saw in real code. Though it was with a ashr, so I still have another patch to come. Then while I was reviewing the nearby code I saw the missing inverse case on this transform which I fixed last week in D38065. But then I started thinking about it more and thought maybe we should handle any kind of zeros and not just zeros created by ands.