[X86] Use parity flag from byte test/cmp instruction for __builtin_parity when input fits in 8 bits.
If the upper bits of the __builtin_parity idiom are known to be
0 we were previously emitting an xor with 0 to get the parity flag.
But we can use cmp/test instead which may expose opportunities for
load folding or combining an AND.