The PMULDQ/PMULUDQ vXi64 instructions only read the even-numbered elements of their inputs (viewed as v2Xi32), which SimplifyDemandedVectorElts should try to make use of.
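To make the even-element behaviour concrete, here is a minimal scalar sketch for the 128-bit case. It is a standalone model, not the LLVM implementation: the names `pmuludqModel`, `pmuldqModel` and `demandedInputElts` are hypothetical, and the bitmask is only a stand-in for the demanded-elements mask that SimplifyDemandedVectorElts works with.

```cpp
#include <array>
#include <cassert>
#include <cstdint>

// Scalar model of PMULUDQ on a 128-bit vector: inputs viewed as v4i32,
// result as v2i64. Result element I is the product of the zero-extended
// input elements 2*I; the odd-numbered input elements are never read.
std::array<uint64_t, 2> pmuludqModel(const std::array<uint32_t, 4> &A,
                                     const std::array<uint32_t, 4> &B) {
  std::array<uint64_t, 2> R{};
  for (unsigned I = 0; I != 2; ++I)
    R[I] = uint64_t(A[2 * I]) * uint64_t(B[2 * I]);
  return R;
}

// Same model for PMULDQ, which sign-extends the even elements instead.
std::array<int64_t, 2> pmuldqModel(const std::array<int32_t, 4> &A,
                                   const std::array<int32_t, 4> &B) {
  std::array<int64_t, 2> R{};
  for (unsigned I = 0; I != 2; ++I)
    R[I] = int64_t(A[2 * I]) * int64_t(B[2 * I]);
  return R;
}

// Hypothetical helper: translate a demanded-elements mask over the vXi64
// result (bit I set = result element I is demanded) into a mask over the
// v2Xi32 inputs. Only even input elements can ever be demanded.
// Assumes NumResultElts <= 16 so the shifts stay within 32 bits.
uint32_t demandedInputElts(uint32_t DemandedResultElts,
                           unsigned NumResultElts) {
  uint32_t Mask = 0;
  for (unsigned I = 0; I != NumResultElts; ++I)
    if (DemandedResultElts & (1u << I))
      Mask |= 1u << (2 * I);
  return Mask;
}
```

With this model, changing any odd-numbered input element leaves the result untouched, and demanding only result element 1 of a v2i64 maps to demanding only input element 2 of each v4i32 operand.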
We can't do much with undef demanded elements; we should probably only support the (mul X, undef) -> 0 pattern, the same as regular integer multiplies. I can add support for it if you guys want, but I can't see it being used by real-world code. The same goes for constant folding support.