The current 'big vectors' stack folded reload testing pattern is very bulky, and it makes it difficult to test all instructions because big vectors tend to use only the ymm instruction implementations.
This patch changes the tests to use a nop inline asm call that lists explicit xmm registers as clobbered side effects. With this we can force a partial register spill of the relevant registers and then check that the reload is correctly folded. The generated asm only adds the forced spill, a nop instruction and a couple of extra labels (a fraction of what the current approach produces).
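As a rough illustration of the pattern described above, a single test might look something like the sketch below. This is not copied from the patch: the function name, the RUN line flags, the exact clobber list and the CHECK regex are all assumptions made for the example. The idea is that the two arguments arrive in xmm0 and xmm1, the asm needs one xmm register for its "=x" output, and every other xmm register is clobbered, so at least one argument has to be spilled across the nop and the reload can then be folded into the following instruction.

```llvm
; Hypothetical RUN line; the flags used by the real tests may differ.
; RUN: llc -O3 -mtriple=x86_64-unknown-unknown -mattr=+avx < %s | FileCheck %s

; Illustrative test function (name is an assumption).
define <2 x double> @stack_fold_addpd(<2 x double> %a0, <2 x double> %a1) {
  ; The CHECK line states what the test hopes to see: vaddpd taking one
  ; operand directly from a stack slot instead of from a separate reload.
  ;CHECK-LABEL: stack_fold_addpd
  ;CHECK:       vaddpd {{-?[0-9]*}}(%rsp), {{%xmm[0-9]+}}, {{%xmm[0-9]+}}

  ; nop with an xmm output and xmm2-xmm15 clobbered, leaving too few free
  ; xmm registers to keep both %a0 and %a1 live across the call.
  %1 = tail call <2 x i64> asm sideeffect "nop", "=x,~{xmm2},~{xmm3},~{xmm4},~{xmm5},~{xmm6},~{xmm7},~{xmm8},~{xmm9},~{xmm10},~{xmm11},~{xmm12},~{xmm13},~{xmm14},~{xmm15},~{flags}"()

  ; The instruction under test; its spilled operand should be folded.
  %2 = fadd <2 x double> %a0, %a1
  ret <2 x double> %2
}
```

Because the operands are 128-bit vectors, the xmm form of the instruction is selected, which is what makes it possible to cover the xmm implementations rather than only the ymm ones.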
If people are happy with this new approach, I can begin adding more exhaustive tests (starting with the AVX1 instructions before moving on to the SSE versions) and properly uncover any issues with the memory folding tables.
I've added some extra tests (the xmm versions of some of the existing folding tests) as a starting point.