Because vector shuffles can only reference two inputs, many (V)INSERTPS patterns end up being split across two target shuffles.
This patch adds combines that try to fold a (V)INSERTPS node with input/output nodes that do nothing but zero out these additional vector elements.
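
For readers less familiar with the instruction, here is a minimal sketch of why such a fold is possible. It is a plain C++ software model of the register form of INSERTPS, not code from this patch: the immediate already carries a 4-bit zero mask (imm8[3:0]), so a neighbouring node that only zeroes lanes of the result can in principle be absorbed into it. The names `V4`, `insertps` and `foldOutputZeroing` are hypothetical and exist only for this illustration.

```cpp
#include <array>
#include <cassert>
#include <cstdint>

using V4 = std::array<float, 4>;

// Software model of the register form of INSERTPS (illustration only).
//   imm8[7:6] selects the source element,
//   imm8[5:4] selects the destination lane,
//   imm8[3:0] is a per-lane zero mask applied after the insert.
V4 insertps(V4 dst, const V4 &src, uint8_t imm) {
  unsigned srcIdx = (imm >> 6) & 3;
  unsigned dstIdx = (imm >> 4) & 3;
  dst[dstIdx] = src[srcIdx];
  for (unsigned i = 0; i < 4; ++i)
    if (imm & (1u << i))
      dst[i] = 0.0f;
  return dst;
}

// Hypothetical helper: absorb a following "zero these lanes" step into the
// immediate by OR-ing the lanes into the existing zero mask.
uint8_t foldOutputZeroing(uint8_t imm, uint8_t zeroedLanes) {
  assert((zeroedLanes & ~0xF) == 0 && "only four lanes can be zeroed");
  return static_cast<uint8_t>(imm | (zeroedLanes & 0xF));
}
```

In this model, an insert followed by a separate zeroing of lanes gives the same result as a single insert whose immediate has those lanes added to its zero mask. The real combine works on DAG nodes and has to check more than this sketch does.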
Why do we need this intermediate variable? I.e., couldn't we just set the appropriate elements of the mask directly, rather than filling in this bitvector and then copying it to the mask?