The "xor (X >> ShiftC), XorC --> (not X) >> ShiftC" fold currently requires the XOR mask to be exactly a shifted all-bits mask, but we can relax this so that the mask only needs to match on the demanded bits.
This helps expose more bit extraction/clearing patterns and fixes the PowerPC testCompares*.ll regressions from D127115.
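To illustrate why matching only under the demanded bits is enough, here is a minimal standalone sketch. It is not the code from the patch: it assumes 32-bit values and a logical right shift, and the helper names (matchesUnderDemandedBits, foldAgreesOnDemandedBits) are hypothetical and exist only for this example.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper: does XorC agree with the shifted all-ones mask
// (~0u >> ShiftC) on every bit the user actually demands?
// Assumes 32-bit values and a logical right shift.
bool matchesUnderDemandedBits(uint32_t XorC, unsigned ShiftC,
                              uint32_t Demanded) {
  assert(ShiftC < 32 && "shift amount must be in range");
  uint32_t ShiftedAllOnes = ~uint32_t{0} >> ShiftC;
  // Any bit where XorC differs from the shifted mask must be undemanded.
  return ((XorC ^ ShiftedAllOnes) & Demanded) == 0;
}

// Hypothetical helper: compare the original and folded forms on the
// demanded bits only.
bool foldAgreesOnDemandedBits(uint32_t X, unsigned ShiftC, uint32_t XorC,
                              uint32_t Demanded) {
  assert(ShiftC < 32 && "shift amount must be in range");
  uint32_t Original = (X >> ShiftC) ^ XorC; // xor (X >> ShiftC), XorC
  uint32_t Folded = (~X) >> ShiftC;         // (not X) >> ShiftC
  return ((Original ^ Folded) & Demanded) == 0;
}
```

For example, with ShiftC = 4 the exact shifted mask is 0x0FFFFFFF, but if only the low 8 bits are demanded then XorC = 0xFF already passes matchesUnderDemandedBits, and the two forms agree on those bits for any X. If all 32 bits are demanded, the same XorC no longer matches and the forms can differ.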
I'm not totally happy that we have all this (duplicated) logic in the ARM/AArch64 overrides - would an alternative be to add a 'bool IsPartialMatch' argument?