If C0 is a mask and C1 shifts out all the masked bits (so that we are
essentially comparing two subsets of the bits of X), we can freely
choose whether the shift is an srl or a shl.
If C1 (shift amount) is a power of 2, we can replace the and+shift
with a rotate.
Otherwise, based on target preference, we can arbitrarily swap shl
and srl in/out to get better constants.
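To make the claimed equivalences concrete, here is a minimal sketch that checks them exhaustively on a 16-bit value. All function names are hypothetical and are not taken from the patch. It assumes C0 is the low mask of (width - C1) bits for the srl form and the matching high mask for the shl form, which is one reading of "C1 shifts out all the masked bits".

```cpp
#include <cassert>
#include <cstdint>

// Illustration only: a 16-bit width keeps the exhaustive check cheap.
constexpr unsigned kBits = 16;

// (X & C0) == (X srl C1), with C0 keeping the low (kBits - C1) bits.
bool cmpViaSrl(uint16_t x, unsigned c1) {
  uint16_t c0 = static_cast<uint16_t>(0xFFFFu >> c1);
  return static_cast<uint16_t>(x & c0) == static_cast<uint16_t>(x >> c1);
}

// (X & C0') == (X shl C1), with C0' keeping the high (kBits - C1) bits.
bool cmpViaShl(uint16_t x, unsigned c1) {
  uint16_t c0 = static_cast<uint16_t>(0xFFFFu << c1);
  uint16_t shifted = static_cast<uint16_t>(static_cast<uint32_t>(x) << c1);
  return static_cast<uint16_t>(x & c0) == shifted;
}

// X == rotl(X, C1); only valid for 0 < c1 < kBits.
bool cmpViaRotate(uint16_t x, unsigned c1) {
  uint32_t wide = x;
  uint16_t rot = static_cast<uint16_t>((wide << c1) | (wide >> (kBits - c1)));
  return x == rot;
}

// True if the srl and shl forms agree for every 16-bit X at this shift amount.
bool shiftFormsAgree(unsigned c1) {
  for (uint32_t v = 0; v <= 0xFFFFu; ++v) {
    uint16_t x = static_cast<uint16_t>(v);
    if (cmpViaSrl(x, c1) != cmpViaShl(x, c1))
      return false;
  }
  return true;
}

// True if the rotate form agrees with the srl form for every 16-bit X.
bool rotateFormAgrees(unsigned c1) {
  for (uint32_t v = 0; v <= 0xFFFFu; ++v) {
    uint16_t x = static_cast<uint16_t>(v);
    if (cmpViaSrl(x, c1) != cmpViaRotate(x, c1))
      return false;
  }
  return true;
}
```

Under these assumptions the srl and shl forms agree for every shift amount from 1 to 15. The rotate form agrees when the shift amount is a power of 2 (1, 2, 4, 8) and fails for an amount such as 3, which matches the power-of-2 condition stated above.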
On x86 we can use this re-ordering to:
- get better AND constants for C0 (zero-extended moves, or avoiding an imm64).
- convert srl to shl if the shl will be implementable with lea or add (both of which can be preferable).
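As a rough illustration of the two x86 points, the sketch below writes the same comparisons in both shift directions for a 64-bit X. The function names are hypothetical, and the comments on immediates describe general x86-64 encoding rules (an AND takes at most a sign-extended 32-bit immediate, and a 32-bit mov zero-extends). They do not describe what the patch or any particular compiler actually emits.

```cpp
#include <cassert>
#include <cstdint>

// C1 = 32, srl form: the mask 0x00000000FFFFFFFF can be realized as a
// zero-extending 32-bit move, so no 64-bit immediate is needed.
bool halvesEqualSrl(uint64_t x) {
  return (x & 0x00000000FFFFFFFFull) == (x >> 32);
}

// C1 = 32, shl form: the mask 0xFFFFFFFF00000000 does not fit a
// sign-extended 32-bit immediate, so it would need an imm64.
bool halvesEqualShl(uint64_t x) {
  return (x & 0xFFFFFFFF00000000ull) == (x << 32);
}

// C1 = 1, srl form: the mask 0x7FFFFFFFFFFFFFFF again needs an imm64.
bool adjacentBitsEqualSrl(uint64_t x) {
  return (x & 0x7FFFFFFFFFFFFFFFull) == (x >> 1);
}

// C1 = 1, shl form: the mask is -2 (a small immediate) and the shift
// is just x + x, which is the kind of shl that add or lea can implement.
bool adjacentBitsEqualShl(uint64_t x) {
  return (x & ~1ull) == (x + x);
}
```

For C1 = 32 the srl direction gives the cheaper constant, while for C1 = 1 the shl direction does, which is why the choice is left to target preference.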
This function name seems very general and yet the transform seems very specific.