This patch relaxes the SplitOpsAndApply requirements for 128/256/512-bit sized vectors for SSE2/AVX2/AVX512BW so that it splits illegal types by a variety vector sizes down to 128-bits in size and I've updated the AVG combine to demonstrate this with its existing v48i8 + v40i16 test cases.
It splits the vectors into 512, 256 and finally 128-bit vector results and concatenates them back together (using the minimum necessary subvector size).
If the result size is less than 128-bits then it gets passed through in the old way as a single op, but otherwise the result size must be modulo 128-bits.
I've had to put in a limit that the operands must be the same vector width (not type) as the result - otherwise the operand subvector splitting becomes a lot more complex.
clang-format: please reformat the code