This patch adds shuffle matching for the SSE3 MOVDDUP, MOVSLDUP and MOVSHDUP instructions. Their main benefit is that many single-source shuffles no longer need dual-source instructions such as SHUFPD/SHUFPS, which avoids extra moves and allows more load folds.
Adding these instructions uncovered an issue in XFormVExtractWithShuffleIntoLoad, which crashed on single-operand shuffle instructions. It also required fixing getTargetShuffleMask to correctly identify these instructions as unary shuffles.
I also added a shuffle mask decoder for MOVDDUP.
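
For reference, the lane patterns these three instructions produce can be sketched as standalone mask builders. This is only an illustration and not the code in this patch: the function names below are hypothetical, and the masks use plain std::vector<int> rather than LLVM's own types. MOVDDUP is indexed in 64-bit elements, MOVSLDUP and MOVSHDUP in 32-bit elements.

```cpp
#include <cassert>
#include <vector>

// Hypothetical helpers for illustration only; these are not LLVM APIs.

// Builds a mask where each pair of result lanes reads the same source lane:
// the even lane of the pair (Offset == 0) or the odd lane (Offset == 1).
static std::vector<int> sketchDupMask(unsigned NumElts, unsigned Offset) {
  assert(NumElts % 2 == 0 && "expects an even element count");
  assert(Offset < 2 && "offset selects the even or odd lane of each pair");
  std::vector<int> Mask;
  for (unsigned i = 0; i < NumElts; i += 2) {
    Mask.push_back(static_cast<int>(i + Offset));
    Mask.push_back(static_cast<int>(i + Offset));
  }
  return Mask;
}

// MOVDDUP: duplicate the low double of each 128-bit lane (64-bit elements).
std::vector<int> sketchMovddupMask(unsigned NumElts) {
  return sketchDupMask(NumElts, 0);
}

// MOVSLDUP: duplicate the even-indexed floats (32-bit elements).
std::vector<int> sketchMovsldupMask(unsigned NumElts) {
  return sketchDupMask(NumElts, 0);
}

// MOVSHDUP: duplicate the odd-indexed floats (32-bit elements).
std::vector<int> sketchMovshdupMask(unsigned NumElts) {
  return sketchDupMask(NumElts, 1);
}
```

Every index in these masks refers to the first operand only, which is why the instructions have to be treated as unary shuffles.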