This is an enhancement to the proposal in D45862: we add logic for the special case where a cmp+select can clearly be reduced to a single bitwise logic instruction. The goal is to stop doing these select transforms in cases where they do not improve the IR instruction count; in all of the cases handled here, the instruction count does improve.
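To make the kind of reduction concrete, here is a minimal sketch at the source level. The pattern is an assumption chosen for illustration and is not necessarily one of the folds in this patch; the names `select_form`, `bitwise_form`, and `forms_agree` are hypothetical. The ternary corresponds to a cmp+select in IR, and the second function is the single bitwise logic op that can replace it.

```cpp
#include <cstdint>

// Compare + select form: after the mask, this needs an icmp and a select
// (plus the 'or' feeding one select arm).
// Hypothetical illustration, not copied from the patch.
uint32_t select_form(uint32_t x, uint32_t y) {
  return (x & 1u) == 0 ? y : (y | 1u);
}

// Equivalent bitwise form: the icmp and select collapse into one 'or'
// of y with the masked bit.
uint32_t bitwise_form(uint32_t x, uint32_t y) {
  return y | (x & 1u);
}

// Compare the two forms exhaustively over a small range of inputs.
bool forms_agree() {
  for (uint32_t x = 0; x < 256; ++x)
    for (uint32_t y = 0; y < 256; ++y)
      if (select_form(x, y) != bitwise_form(x, y))
        return false;
  return true;
}
```

When the low bit of `x` is clear, both forms return `y`; when it is set, both return `y | 1`. The bitwise form therefore uses fewer instructions than the cmp+select form, which is the property the transform is meant to guarantee.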
I think this will produce the same results on the 3-way compare tests that are proposed in the other patch. We should commit those tests first, so that the diffs show up here or in the other patch.
I looked briefly at x86 and AArch64 codegen with this change, and we're going to get more cmov/vblend and csel/bsl because the DAG doesn't have replacements for all of these folds. In some cases the changes look better and in others they look worse, but I suspect we'll have to run detailed perf tests per uarch to know the true perf outcome.