This is PR37104.
PR6773 will introduce an IR canonicalization that is likely bad for the end assembly.
Previously, andl+andn / andps+andnps / bic / bsl would be generated (see @out).
Now, they would no longer be generated (see @in).
So we need to make sure that they are still generated.
If the mask is constant, I currently always unfold it.
Otherwise, I use the hasAndNot() TLI hook.
For now, only handle scalars.
https://rise4fun.com/Alive/bO6
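For readers who have not seen the pattern, here is a minimal standalone sketch of the two forms involved. It assumes the canonical IR form is `((x ^ y) & m) ^ y` and the unfolded form is `(x & m) | (y & ~m)`; neither expression is spelled out in this description, so treat both as assumptions. The function names are hypothetical and are not part of the patch.

```cpp
#include <cstdint>

// Assumed canonical (folded) form: xor, and, xor. No inverted mask is needed.
uint32_t maskedMergeFolded(uint32_t x, uint32_t y, uint32_t m) {
  return ((x ^ y) & m) ^ y;
}

// Assumed unfolded form: take bits of x where m is set and bits of y where
// m is clear. The (y & ~m) half is what an and-not instruction can compute.
uint32_t maskedMergeUnfolded(uint32_t x, uint32_t y, uint32_t m) {
  return (x & m) | (y & ~m);
}

// Exhaustively compare both forms over all 8-bit inputs. Each bit position
// is independent, so agreement here is good evidence for wider types too.
bool formsAgreeOn8Bit() {
  for (uint32_t x = 0; x < 256; ++x)
    for (uint32_t y = 0; y < 256; ++y)
      for (uint32_t m = 0; m < 256; ++m)
        if (maskedMergeFolded(x, y, m) != maskedMergeUnfolded(x, y, m))
          return false;
  return true;
}
```

The folded form has a serial dependency chain of three operations, while the two halves of the unfolded form can execute in parallel when the target has an and-not instruction, which is presumably why the unfold is gated on hasAndNot() for non-constant masks.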
I *really* don't like the code I wrote in DAGCombiner::unfoldMaskedMerge().
It is super fragile. Is there something like the IR pattern matchers for this?
After stepping through more of your tests, I see why this is ugly.
We don't have to capture the intermediate values if the hasOneUse() checks are in the lambda(s) though. What do you think of this version: