This is an alternative to D59006 that achieves identical results for x86 and also improves AArch64. The logic pushes the limits of target-independence, but that is what it takes to avoid inducing any regressions for x86, and there's no harm to AArch64, so this seems better all-around.
Normally, we'd try to improve the generic combines behind the sub-optimal output that we see in the test diffs, but that did not look easy (or even possible) for the cases I looked at. For example, the AArch64 bic/bic pattern appears to be missed because one of the two is still a generic 'not' op, while the other is already a BIC node.