Currently, lowerVECTOR_SHUFFLEAsVNSRL does not support trailing undef
elements or other vnsrl patterns (e.g., an arithmetic sequence of
indices with a common difference of 4). This commit adds support for any
power-of-2 difference and iteratively splits the vector_shuffle into a
series of vnsrl operations.
Some patterns (e.g., vnsrl_4_undef_i8) do not use vector_shuffle
because the generic DAG combiner cannot turn the build_vector into a
vector_shuffle.
isVnsrlShuffle is also provided. I expect that isShuffleMaskLegal can
call isVnsrlShuffle in the future.
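
For illustration only, the idea can be sketched outside of LLVM as
follows. This is a standalone model, not the code in this commit: the
helper names matchStridedMask and narrowIteratively, their signatures,
and the use of -1 for an undef mask element are assumptions made for the
example. The first helper checks whether a mask is an arithmetic
sequence with a power-of-2 difference (undef elements match anything);
the second models each vnsrl as "keep the low or high half of every
adjacent pair" and applies it log2(difference) times.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper (not an LLVM API): returns true if every defined
// element of Mask equals Offset + I * Stride for some power-of-2 Stride >= 2
// and some Offset < Stride. A negative mask element models undef.
// The smallest matching Stride is reported.
static bool matchStridedMask(const std::vector<int> &Mask, unsigned SrcElts,
                             unsigned &Stride, unsigned &Offset) {
  for (unsigned S = 2; S <= SrcElts; S *= 2) {
    for (unsigned Off = 0; Off < S; ++Off) {
      bool Match = true, SawDefined = false;
      for (std::size_t I = 0; I < Mask.size() && Match; ++I) {
        if (Mask[I] < 0)
          continue; // undef matches any index
        SawDefined = true;
        unsigned Expected = Off + static_cast<unsigned>(I) * S;
        Match = Expected < SrcElts &&
                static_cast<unsigned>(Mask[I]) == Expected;
      }
      if (Match && SawDefined) {
        Stride = S;
        Offset = Off;
        return true;
      }
    }
  }
  return false;
}

// Models one narrowing step: view Src as adjacent pairs and keep the low
// (Half == 0) or high (Half == 1) element of each pair.
static std::vector<int> keepHalfOfPairs(const std::vector<int> &Src,
                                        unsigned Half) {
  std::vector<int> Out;
  for (std::size_t I = Half; I < Src.size(); I += 2)
    Out.push_back(Src[I]);
  return Out;
}

// Applies log2(Stride) narrowing steps. Bit k of Offset (least significant
// bit first) selects the low or high half in step k.
static std::vector<int> narrowIteratively(std::vector<int> Src,
                                          unsigned Stride, unsigned Offset) {
  assert(Stride >= 2 && (Stride & (Stride - 1)) == 0 &&
         "Stride must be a power of 2");
  for (; Stride > 1; Stride /= 2, Offset /= 2)
    Src = keepHalfOfPairs(Src, Offset & 1);
  return Src;
}
```

In this model, a mask of {1, 5, undef, undef} over 16 source elements
matches with a difference of 4 and an offset of 1, and two narrowing
steps (high half, then low half) select source elements 1, 5, 9 and 13.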