We have a number of useful lowering strategies for VBROADCAST instructions (both from memory and register element 0) which the 128-bit form of the MOVDDUP instruction can make use of.
This patch tweaks lowerVectorShuffleAsBroadcast to enable it to broadcast 2f64 args using MOVDDUP as well.
It does require a slight tweak to the lowerVectorShuffleAsBroadcast mechanism as the existing MOVDDUP lowering uses isShuffleEquivalent which can match binary shuffles that can lower to (unary) broadcasts.