MFVSRD is faster than MFVSRLD. If the input vector is symmetrical, both
instructions produce the same result, so we should prefer the faster one.
This patch mainly looks at symmetrical situations that are known to arise after
a vector doubleword swap.
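
As a rough illustration of why the substitution is safe, here is a minimal C++
model, not the actual backend code. It assumes a 128-bit VSR can be modelled as
two 64-bit doublewords, with MFVSRD reading doubleword 0 and MFVSRLD reading
doubleword 1. The names Vsr, swapDoublewords, isSymmetrical and splatDoubleword
are hypothetical helpers introduced only for this sketch.

```cpp
#include <array>
#include <cstdint>

// Hypothetical model of a 128-bit VSR: index 0 is doubleword 0,
// index 1 is doubleword 1.
using Vsr = std::array<std::uint64_t, 2>;

// Models MFVSRD: move doubleword 0 of the VSR to a GPR.
inline std::uint64_t mfvsrd(const Vsr &V) { return V[0]; }

// Models MFVSRLD: move doubleword 1 (the lower doubleword) to a GPR.
inline std::uint64_t mfvsrld(const Vsr &V) { return V[1]; }

// Models a vector doubleword swap: exchange the two halves.
inline Vsr swapDoublewords(const Vsr &V) { return Vsr{V[1], V[0]}; }

// A vector is symmetrical here when both doublewords hold the same value.
inline bool isSymmetrical(const Vsr &V) { return V[0] == V[1]; }

// One way a symmetrical vector can arise: splatting a single doubleword.
inline Vsr splatDoubleword(std::uint64_t X) { return Vsr{X, X}; }
```

In this model, a symmetrical input is unchanged by the swap, so mfvsrld on the
swapped vector and mfvsrd on the original return the same value. For an
asymmetrical input the two differ, which is why the replacement has to be
guarded by a symmetry check.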
Maybe we can try an early exit if this works: