(i8 X ^ 128) & (i8 X s>> 7) --> usubsat X, 128
I haven't found a generalization of this identity:
https://alive2.llvm.org/ce/z/_sriEQ
Note: I was actually looking at the first form of the pattern in that link, but that's part of a long chain of potential missed transforms in codegen and IR....that I hope ends here!
The predicates for when this is profitable are a bit tricky. This version of the patch excludes multi-use but includes custom lowering (as opposed to legal only).
On x86 for example, we have custom lowering for some vector types and that uses umax and sub. So to enable that fold, we need add use checks to avoid regressions. Even with legal-only lowering, we could see code with extra reg move instructions for extra uses, so that constraint would have to be eased very carefully to avoid penalties.
What about the same thing with X +/- 128 instead of X ^ 128? Do they already get canonicalized to the XOR version?