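This patch sinks the splat operand of an add/mul, i.e. the shufflevector(insertelement(...), ...) in add/mul(shufflevector(insertelement(...), ...), ...), into the basic block of the add/mul that uses it, so that the splat and its user can be selected together. This is useful for various MVE instructions, such as vmla, that take a scalar operand in a general-purpose (R) register.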
Loop tests have been added to the vmla test file to make sure vmlas are generated in loops.
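For illustration only, here is a minimal C sketch of the kind of source loop that could give rise to this pattern once vectorized. It is not taken from the patch or its tests, and the function name scale_accumulate is hypothetical. The scalar s is loop-invariant, so a vectorizer would typically build its splat outside the loop, which is the situation in which sinking the splat next to its user matters. Whether a vmla is actually selected depends on the target, flags, and optimization level.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical example: multiply each element of x by the loop-invariant
 * scalar s and accumulate into acc. After vectorization, s becomes a splat
 * (shufflevector of an insertelement) feeding a vector mul/add. */
void scale_accumulate(int32_t *acc, const int32_t *x, int32_t s, size_t n) {
    assert(acc != NULL && x != NULL);
    for (size_t i = 0; i < n; ++i) {
        acc[i] += x[i] * s;
    }
}
```

If the splat stays in the loop preheader while the mul/add sits in the loop body, instruction selection works one basic block at a time and cannot see both together; with the splat sunk into the body, the scalar-register form of the instruction can be matched.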
Make sure you run clang-format.