There are two equivalent expressions to compute a "masked merge":
a) (m & x) | (~m & y)
b) ((x ^ y) & m) ^ y
Variant a) is preferable when an and-not instruction is available, and we already optimize for that case (see DAGCombiner::unfoldMaskedMerge). Without and-not, variant b) needs one operation fewer (xor, and, xor versus not, and, and, or). This patch adds the reverse operation, TargetLowering::foldMaskedMerge, and uses it in the X86 target when and-not is not available (it is part of the BMI extension, so it is not present on older cores). This speeds up cryptographic hash functions (I've seen the pattern in md5 and sha256 implementations).
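For reference, the equivalence of the two forms can be checked with a small standalone sketch. This is not code from the patch: the function names maskedMergeAndNot, maskedMergeXor and formsAgreeOnAllBytes are made up for illustration and have nothing to do with the DAGCombiner or TargetLowering interfaces.

```cpp
#include <cstdint>

// Variant a): take the bits of x where m is set, the bits of y elsewhere.
// (Hypothetical helper, for illustration only.)
constexpr std::uint32_t maskedMergeAndNot(std::uint32_t m, std::uint32_t x,
                                          std::uint32_t y) {
  return (m & x) | (~m & y);
}

// Variant b): same result without a "not"; x ^ y marks the differing bits,
// m selects which of those get flipped in y.
// (Hypothetical helper, for illustration only.)
constexpr std::uint32_t maskedMergeXor(std::uint32_t m, std::uint32_t x,
                                       std::uint32_t y) {
  return ((x ^ y) & m) ^ y;
}

// Brute-force comparison over all 8-bit values of m, x and y. Both forms
// work bit by bit, so agreement here is a reasonable sanity check.
inline bool formsAgreeOnAllBytes() {
  for (std::uint32_t m = 0; m < 256; ++m)
    for (std::uint32_t x = 0; x < 256; ++x)
      for (std::uint32_t y = 0; y < 256; ++y)
        if (maskedMergeAndNot(m, x, y) != maskedMergeXor(m, x, y))
          return false;
  return true;
}
```

With m = 0xF0, x = 0xAB and y = 0xCD, both helpers return 0xAD: the high nibble comes from x and the low nibble from y.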
(style) remove unnecessary braces