Given a shuffle mask like <3, 0, 1, 2, 7, 4, 5, 6> for v8i8, we can
reinterpret it as a shuffle of v2i32 where each of the two i32s is bit-rotated,
and lower it as a vror.vi (if legal with zvbb enabled).
We also need to make sure that the larger element type is a valid SEW, hence
the tests for zve32x.
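
The mask check described above can be sketched roughly as follows. This is a
standalone illustration, not the code in ShuffleVectorSDNode: the function name
matchBitRotateLeft and its parameters are hypothetical, negative mask entries
are treated as undef, and lane 0 is assumed to be the least significant
sub-element of the wider element.

```cpp
#include <cstddef>
#include <optional>
#include <vector>

// Hypothetical helper (not an LLVM API): if every group of numSubElts mask
// entries is the same in-group element rotation, return the rotate-left
// amount in bits for the widened element. maxEltBits stands in for the
// "larger element type must be a valid SEW" check (e.g. 32 for zve32x).
std::optional<unsigned> matchBitRotateLeft(const std::vector<int> &mask,
                                           unsigned eltBits,
                                           unsigned numSubElts,
                                           unsigned maxEltBits) {
  if (numSubElts < 2 || mask.empty() || mask.size() % numSubElts != 0)
    return std::nullopt;
  // The widened element must not exceed the largest legal element width.
  if (eltBits * numSubElts > maxEltBits)
    return std::nullopt;

  int offset = -1; // rotation in sub-elements; -1 means "not seen yet"
  for (std::size_t base = 0; base < mask.size(); base += numSubElts) {
    for (unsigned j = 0; j < numSubElts; ++j) {
      int m = mask[base + j];
      if (m < 0)
        continue; // undef lane, matches any rotation
      std::size_t idx = static_cast<std::size_t>(m);
      // The source lane must come from the same group.
      if (idx < base || idx >= base + numSubElts)
        return std::nullopt;
      unsigned src = static_cast<unsigned>(idx - base);
      // Result lane j reads source lane src: lanes moved up by (j - src).
      unsigned o = (numSubElts + j - src) % numSubElts;
      if (offset >= 0 && static_cast<unsigned>(offset) != o)
        return std::nullopt; // groups or lanes disagree on the rotation
      offset = static_cast<int>(o);
    }
  }
  if (offset <= 0)
    return std::nullopt; // identity mask or all undef: nothing to rotate
  return static_cast<unsigned>(offset) * eltBits;
}
```

For the mask <3, 0, 1, 2, 7, 4, 5, 6> with eltBits = 8, numSubElts = 4 and
maxEltBits = 32, this returns 8: each i32 is rotated left by 8 bits, which is
the same as a rotate right by 32 - 8 = 24.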
X86 already did this, so I've extracted the logic for it and put it inside
ShuffleVectorSDNode so it can be reused by RISC-V. I originally tried to add
this as a generic combine in DAGCombiner.cpp, but it ended up causing worse
codegen on X86 and PPC.
As an example, if the rotateAmt is 24 then on RV32 the constant comes out as:
I tried handling this case in lowerBuildVectorOfConstants by lowering it as a
v1i64 vmv_v_x_vl, with the constant reinterpreted across the elements, but it
doesn't catch any other cases, since this pattern doesn't seem to be generated
anywhere else.