bo (ins undef, X, Z), (ins undef, Y, Z) --> ins undef, (bo X, Y), Z
This is another step in generic vector narrowing. It's also a step towards more horizontal op formation specifically for x86 (although we still failed to match those in the affected tests).
The cases that should be scalarized are still not optimal after this change, but it's an improvement to use a narrower vector op when we know part of the result must be undef because both inputs are undef in the same vector lanes.
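As a rough illustration of why the fold is safe, here is a small Python model (not LLVM code). It assumes an undef lane can be represented as `None` and that a binop with any undef operand may produce undef, which is a simplification. The helper names `insert_subvector`, `binop`, `wide_form`, and `narrow_form` are hypothetical and exist only for this sketch.

```python
from typing import Callable, List, Optional

Lane = Optional[int]  # None models an undef lane (simplifying assumption)


def insert_subvector(width: int, sub: List[Lane], index: int) -> List[Lane]:
    """Model of `ins undef, sub, index`: all-undef vector with `sub` placed at `index`."""
    assert 0 <= index and index + len(sub) <= width, "subvector must fit"
    wide: List[Lane] = [None] * width
    wide[index:index + len(sub)] = sub
    return wide


def binop(op: Callable[[int, int], int], a: List[Lane], b: List[Lane]) -> List[Lane]:
    """Lane-wise binary op; a lane is undef if either input lane is undef."""
    assert len(a) == len(b)
    return [None if x is None or y is None else op(x, y) for x, y in zip(a, b)]


def wide_form(op, x, y, width, index):
    # bo (ins undef, X, Z), (ins undef, Y, Z)
    return binop(op, insert_subvector(width, x, index),
                 insert_subvector(width, y, index))


def narrow_form(op, x, y, width, index):
    # ins undef, (bo X, Y), Z
    return insert_subvector(width, binop(op, x, y), index)


if __name__ == "__main__":
    x = [1, 2, 3, 4]
    y = [10, 20, 30, 40]
    add = lambda a, b: a + b
    # Both forms agree lane for lane, but the narrow form does half-width math.
    assert wide_form(add, x, y, 8, 4) == narrow_form(add, x, y, 8, 4)
```

In this model the two forms agree only because both inserts use the same index Z, so the undef lanes of the two operands line up exactly.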
I think a similar match but checking for a constant operand might help some of the cases in D51553.