Tries to perform
(lshr (add (i2^n X), (i2^n Y)), 2^(n-1))
-> (icmp ult (add (trunc X), (trunc Y)), (trunc X))
where
- Only the low K bits of X and Y can be non-zero, where K = 2^(n-1) is the shift amount.
- The add is only used by the shr, or by iK (or narrower) truncates.
- The lshr type has more than 2 bits (other types are boolean math).
- K > 1
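As a sanity check on the fold described above, the following Python sketch compares both forms exhaustively for one assumed configuration: K = 8 with a 16-bit wide type. The names `wide_form`, `narrow_form`, and `forms_agree` are illustrative and not part of the patch, and Python integers with explicit masks stand in for fixed-width IR types.

```python
K = 8                        # assumed shift amount; also the narrow width (iK)
WIDE_BITS = 2 * K            # assumed wide type width, so that 2^(n-1) == K
NARROW_MASK = (1 << K) - 1
WIDE_MASK = (1 << WIDE_BITS) - 1


def wide_form(x: int, y: int) -> int:
    """(lshr (add X, Y), K), evaluated in the wide type."""
    return ((x + y) & WIDE_MASK) >> K


def narrow_form(x: int, y: int) -> int:
    """(icmp ult (add (trunc X), (trunc Y)), (trunc X)), evaluated in iK."""
    tx, ty = x & NARROW_MASK, y & NARROW_MASK
    narrow_sum = (tx + ty) & NARROW_MASK  # wrapping add in the narrow type
    return int(narrow_sum < tx)           # unsigned compare yields 0 or 1


def forms_agree() -> bool:
    """Compare both forms over every pair of inputs that fits in K bits."""
    return all(
        wide_form(x, y) == narrow_form(x, y)
        for x in range(1 << K)
        for y in range(1 << K)
    )
```

Under these assumptions `forms_agree()` holds for all 65536 input pairs. The first precondition matters: with `x = 1 << K` and `y = 0` the input no longer fits in K bits, and `wide_form` yields 1 while `narrow_form` yields 0.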
This pattern seems to come only from OpenCL front-ends, so adding DAG/GISel combines doesn't seem to be worth the complexity.
Original patch D107552 by @abinavpp - adapted to use (a + b < a) instead of uaddo following discussion on the review.
See this issue https://github.com/RadeonOpenCompute/ROCm/issues/488