This patch adds support for combining a build vector into a shuffle.
It applies when the build vector is made of elements extracted from two vectors (vec1, vec2), where vec2 is half the size of vec1. Without the patch, this would generate a series of extract/insert operations.
I'm not sure whether the changed AArch64 test got better or worse, though.
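To make the intended combine concrete, here is a minimal standalone sketch of how a shuffle mask could be derived for such a build vector. It is not the code from the patch: `ExtractedElt` and `buildShuffleMask` are hypothetical names, and the sketch assumes vec2 is first widened to the width of vec1 (for example by concatenating it with undef) so that both shuffle operands have the same type. Mask values follow the usual two-operand convention, where -1 marks an undefined lane.

```cpp
#include <cassert>
#include <vector>

// Hypothetical description of one build-vector operand: which source vector
// it was extracted from (0 = vec1, 1 = vec2, -1 = undef) and at which index.
struct ExtractedElt {
  int Source;
  int Index;
};

// Builds a shuffle mask for the build vector, assuming vec2 has already been
// widened to NumElts1 lanes. Values in [0, NumElts1) select from vec1, values
// in [NumElts1, 2 * NumElts1) select from the widened vec2, and -1 marks an
// undefined lane.
std::vector<int> buildShuffleMask(const std::vector<ExtractedElt> &Elts,
                                  int NumElts1, int NumElts2) {
  assert(NumElts1 == 2 * NumElts2 && "sketch assumes vec2 is half of vec1");
  std::vector<int> Mask;
  Mask.reserve(Elts.size());
  for (const ExtractedElt &E : Elts) {
    if (E.Source < 0)
      Mask.push_back(-1);                 // undefined lane
    else if (E.Source == 0)
      Mask.push_back(E.Index);            // lane taken from vec1
    else
      Mask.push_back(NumElts1 + E.Index); // lane taken from widened vec2
  }
  return Mask;
}
```

With a 4-element vec1 and a 2-element vec2, a build vector of (vec1[0], vec2[1], vec1[3], undef) maps to the mask {0, 5, 3, -1}, so a single shuffle would stand in for the extract/insert chain.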
In the case above you'll find the condition (VT.getSizeInBits() % InVT1.getSizeInBits() == 0). Is your case the one where
(VT.getSizeInBits() % InVT2.getSizeInBits() == 0)?