The (cmp (and X, Y) 0) pattern is greedy: it forms a TESTrr and consumes the and, even when it would be better to select one of the BMI/TBM instructions, such as BLSR or BLSI.
This patch removes the pattern from isel and adds a post-processing check that combines TESTrr+ANDrr into just a TESTrr. With this patch we are able to select the BMI/TBM instructions, but we'll also emit a TESTrr when the result is compared to 0. In many cases the peephole pass will be able to use optimizeCompareInstr to remove the TEST, but it's probably not perfect.
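For illustration, here is a minimal C sketch of the source-level shapes this concerns. The function names are hypothetical and not taken from the patch, and whether a compiler actually selects BLSR/BLSI for them depends on the target features enabled, so treat it as a sketch of the pattern rather than a statement about the generated code.

```c
#include <stdbool.h>
#include <stdint.h>

/* x & (x - 1) clears the lowest set bit: the operation BLSR performs. */
uint32_t clear_lowest_bit(uint32_t x) { return x & (x - 1); }

/* x & -x isolates the lowest set bit: the operation BLSI performs. */
uint32_t isolate_lowest_bit(uint32_t x) { return x & (0u - x); }

/* An `and` whose result is compared to 0, i.e. the (cmp (and X, Y) 0)
   shape, where the `and` is also a BLSR candidate. */
bool has_at_most_one_bit(uint32_t x) { return clear_lowest_bit(x) == 0; }
```

In `has_at_most_one_bit`, the and feeds a compare against 0, which is the case where the description above expects either a BLSR plus a TEST or, if the peephole succeeds, a BLSR whose flags are used directly.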
This is a strange/interesting test.
If %a is zero, then %t1 is also zero.
If %a is not zero, then %t1 has exactly one bit set.
-->
Testing whether %t1 is equal to 0 is equivalent to testing whether %a is 0.
The only case where %t2 is TRUE is when %a is 0.
This whole logic could be folded into an icmp + select, so we don't even need to select a BLSI.
This sequence should be optimized at the IR level. I didn't test whether that is what actually happens.
That being said, I take it that the purpose of this test was different. Should this test be rewritten in a way that doesn't expose that simplification?
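The equivalence argued above can be checked with a small C sketch. It assumes that %t1 is computed with the BLSI identity `a & -a`, which is an assumption about the test since the IR is not quoted here. The helper names are hypothetical.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumed shape of %t1: the BLSI identity a & -a. */
uint32_t blsi_value(uint32_t a) { return a & (0u - a); }

/* True when v has exactly one bit set. */
bool is_single_bit(uint32_t v) { return v != 0 && (v & (v - 1)) == 0; }

/* Counts inputs in [0, limit) that violate either claim:
   (1) t1 == 0 exactly when a == 0;
   (2) for a != 0, t1 has exactly one bit set. */
uint32_t count_counterexamples(uint32_t limit) {
    uint32_t bad = 0;
    for (uint32_t a = 0; a < limit; ++a) {
        uint32_t t1 = blsi_value(a);
        if ((t1 == 0) != (a == 0)) ++bad;
        if (a != 0 && !is_single_bit(t1)) ++bad;
    }
    return bad;
}
```

Under that assumption, `count_counterexamples` finding no violations over a range of inputs is consistent with the comparison on %t1 being replaceable by a comparison on %a.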