1 << (cttz X) --> -X & X
https://alive2.llvm.org/ce/z/qv3E9e
This creates an extra use of the input value, which is generally not preferred, but there are advantages to canonicalizing in this direction:
- 'negate' and 'and' allow for better analysis than 'cttz'.
- This is more likely to induce follow-on transforms (in the example from issue #60801, we'll get the decrement pattern).
- The more basic ALU ops are more likely to result in better codegen across a variety of targets.
This won't solve the motivating bugs (see issue #60799) because we do not recognize the redundant icmp+select, and the x86 backend may lack the pattern-matching needed to produce the optimal BMI instructions.
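
As an informal sanity check alongside the Alive2 proof, the identity in the title can be sketched in plain C++. This is only an illustration, not the InstCombine code: the helper names (cttz32, shl_cttz, neg_and, patterns_agree_16bit) are hypothetical, and X = 0 is skipped because the shift amount would then equal the bit width.

```cpp
#include <cstdint>

// Hypothetical helper: count trailing zeros of a NONZERO 32-bit value.
// (Calling it with 0 would loop forever, so callers must exclude 0.)
static unsigned cttz32(uint32_t x) {
  unsigned n = 0;
  while ((x & 1u) == 0) {
    x >>= 1;
    ++n;
  }
  return n;
}

// Source pattern: 1 << cttz(X), assuming X != 0.
static uint32_t shl_cttz(uint32_t x) { return uint32_t{1} << cttz32(x); }

// Target pattern: -X & X. Unsigned negation wraps, so this isolates the
// lowest set bit of X.
static uint32_t neg_and(uint32_t x) { return (0u - x) & x; }

// Compare both patterns for every nonzero 16-bit input.
static bool patterns_agree_16bit() {
  for (uint32_t x = 1; x <= 0xFFFFu; ++x)
    if (shl_cttz(x) != neg_and(x))
      return false;
  return true;
}
```

Both forms isolate the lowest set bit; the second does so with only basic ALU ops, which is the point of the bullets above.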