Currently, lowerVECTOR_SHUFFLEAsVNSRL does not support trailing undef
elements or other vnsrl patterns (e.g., an arithmetic sequence with a
common difference of 4). This commit adds support for power-of-2
differences and iteratively splits the vector_shuffle into a series of
vnsrl operations.
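
As a rough illustration of the iterative split, the sketch below models a
strided selection on plain arrays as repeated "keep every other element"
steps, one per factor of 2 in the stride. It is not the lowering code from
this commit: keepEveryOther and selectByStride are hypothetical names, and
the model assumes the stride is a power of 2 and the starting offset is
smaller than the stride.

```cpp
#include <cstddef>
#include <vector>

// Models a single narrowing step: keep every other element, starting at
// lane 0 (Odd == false) or lane 1 (Odd == true).
std::vector<int> keepEveryOther(const std::vector<int> &Src, bool Odd) {
  std::vector<int> Out;
  for (std::size_t I = Odd ? 1 : 0; I < Src.size(); I += 2)
    Out.push_back(Src[I]);
  return Out;
}

// Selects lanes Offset, Offset + Stride, Offset + 2 * Stride, ... by
// halving the stride once per step.
// Assumption: Stride is a power of 2 and Offset < Stride.
std::vector<int> selectByStride(std::vector<int> Src, unsigned Stride,
                                unsigned Offset) {
  while (Stride > 1) {
    // The low bit of Offset decides whether this step keeps odd lanes.
    Src = keepEveryOther(Src, (Offset & 1u) != 0);
    Offset >>= 1;
    Stride >>= 1;
  }
  return Src;
}
```

In this model, a difference of 4 takes two steps and a difference of 8
takes three.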
Some patterns (e.g., vnsrl_4_undef_i8) do not use vector_shuffle
because the generic DAG combiner cannot turn build_vector into
vector_shuffle.
This commit also adds isVnsrlShuffle. I expect that isShuffleMaskLegal
can call isVnsrlShuffle in the future.
What if Mask[0] is -1 and Mask[1] is 1? The difference will be 2. I don't think these checks block that.
I guess maybe it's handled by the std::any_of later?
I'd feel better if we checked Mask[0] and Mask[1] are >= 0 before the subtract.
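
To make the concern concrete, here is a minimal standalone sketch of the guard I have in mind. It is not the code from the patch: getVnsrlStride is a hypothetical helper working on a plain std::vector<int>, with undef lanes encoded as -1 as in shuffle masks. The "offset smaller than stride" check is my assumption about what the pattern requires.

```cpp
#include <cstddef>
#include <optional>
#include <vector>

// Hypothetical helper: returns the stride of a vnsrl-style mask, or
// std::nullopt if the mask does not qualify. Undef lanes are -1.
std::optional<int> getVnsrlStride(const std::vector<int> &Mask) {
  if (Mask.size() < 2)
    return std::nullopt;
  // Reject undef in the first two lanes before subtracting. Without this,
  // Mask = {-1, 1, ...} would produce a bogus stride of 1 - (-1) = 2.
  if (Mask[0] < 0 || Mask[1] < 0)
    return std::nullopt;
  int Stride = Mask[1] - Mask[0];
  // The stride must be a power of 2 and at least 2.
  if (Stride < 2 || (Stride & (Stride - 1)) != 0)
    return std::nullopt;
  // Assumption: the starting offset must be smaller than the stride.
  if (Mask[0] >= Stride)
    return std::nullopt;
  // Later lanes must either continue the sequence or be undef.
  for (std::size_t I = 2; I < Mask.size(); ++I)
    if (Mask[I] >= 0 && Mask[I] != Mask[0] + static_cast<int>(I) * Stride)
      return std::nullopt;
  return Stride;
}
```

With the explicit check, {-1, 1, 3, 5} is rejected up front instead of relying on a later pass over the mask, while {0, 4, 8, -1} (trailing undef) is still accepted with a stride of 4.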