xor/and/or (bitcast(A), bitcast(B)) -> bitcast(op (A,B)) was only being combined at the AfterLegalizeTypes stage. This patch permits the combine to occur at any earlier stage as well.
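For readers who want to see why the fold is value-preserving, here is a minimal standalone sketch. It is not the DAGCombiner code: memcpy stands in for the bitcast, a 4 x i8 array stands in for the vector operand, and the helper names (bitcastToScalar, xorOfBitcasts, bitcastOfXor) are hypothetical and only used for illustration.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>

using Vec4i8 = std::array<std::uint8_t, 4>;

// Stand-in for a bitcast: reinterpret four bytes as one 32-bit integer.
inline std::uint32_t bitcastToScalar(const Vec4i8 &V) {
  std::uint32_t R;
  std::memcpy(&R, V.data(), sizeof(R));
  return R;
}

// xor(bitcast(A), bitcast(B)): bitcast both operands, then xor the scalars.
inline std::uint32_t xorOfBitcasts(const Vec4i8 &A, const Vec4i8 &B) {
  return bitcastToScalar(A) ^ bitcastToScalar(B);
}

// bitcast(xor(A, B)): xor lane by lane in the source type, then bitcast once.
inline std::uint32_t bitcastOfXor(const Vec4i8 &A, const Vec4i8 &B) {
  Vec4i8 R{};
  for (std::size_t I = 0; I < R.size(); ++I)
    R[I] = static_cast<std::uint8_t>(A[I] ^ B[I]);
  return bitcastToScalar(R);
}
```

Both helpers return the same value for any pair of inputs, because xor/and/or act on each bit independently and a bitcast does not move bits. The second form needs one bitcast instead of two and keeps the logic op in the source type.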
My aim with this is to improve the ability to recognise bitmasks that can be converted to shuffles.
Elena - I had to modify a number of AVX512 mask tests, as the basic bitcast to/from scalar pattern was being stripped out, preventing testing of the mmask bitops. By replacing the bitcasts with loads I can get almost the same result.
I'm not sure that it is safe for illegal types. You don't cover this in the tests.
If you want to do the transformation before type promotion, you should probably check that the types are legal.
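Roughly what I have in mind, as a toy model rather than real DAGCombiner code. ToyTarget, CombineStage and mayFoldLogicOfBitcasts are hypothetical names made up for this sketch, and types are plain strings here. In the actual combine the query would presumably go through the target lowering's type-legality check.

```cpp
#include <cassert>
#include <set>
#include <string>

// Hypothetical stand-in for the combine stages discussed in this review.
enum class CombineStage { BeforeLegalizeTypes, AfterLegalizeTypes };

// Hypothetical target description: just a set of type names treated as legal.
struct ToyTarget {
  std::set<std::string> LegalTypes;
  bool isTypeLegal(const std::string &VT) const {
    return LegalTypes.count(VT) != 0;
  }
};

// Decide whether op(bitcast(A), bitcast(B)) may become bitcast(op(A, B)).
// SrcVT is the type of A and B, i.e. the type the new logic op is created in.
inline bool mayFoldLogicOfBitcasts(const ToyTarget &Target, CombineStage Stage,
                                   const std::string &SrcVT) {
  // Assumption: once types have been legalized, only legal types remain.
  if (Stage == CombineStage::AfterLegalizeTypes)
    return true;
  // Before legalization, only fold when the source type is legal, so the
  // new op is never created in a type that must be promoted or split.
  return Target.isTypeLegal(SrcVT);
}
```

With a guard like this, the early combine still fires for legal source types, and the illegal-type case keeps its current behaviour until someone adds tests for it.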