Try to lower a BUILD_VECTOR composed of extract-extract chains that can be
proven to be a permutation of one vector by indices taken from a non-constant
vector.
We saw this pattern created by ISPC, which resorts to creating it due to the
requirement that shufflevector's mask operand be a *constant* vector.
I have not verified this, but the X86 permute C intrinsics could possibly be
lowered to this pattern instead of to llvm.x86 intrinsics.
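
For illustration, the pattern corresponds to the following scalar model, in
which element i of the result is an extract from the source vector at a
position that is itself extracted from the index vector. This is only a sketch:
the name variablePermute is hypothetical, and wrapping out-of-range indices
with a modulo is an assumption made to keep the example well-defined, not a
statement about what the lowering does.

```cpp
#include <array>
#include <cstddef>
#include <cstdint>

// Scalar model of the extract-extract BUILD_VECTOR pattern:
//   out[i] = extract(src, extract(idx, i))
// Hypothetical helper for illustration only.
template <typename T, typename I, std::size_t N>
std::array<T, N> variablePermute(const std::array<T, N> &src,
                                 const std::array<I, N> &idx) {
  std::array<T, N> out{};
  for (std::size_t i = 0; i < N; ++i) {
    // Inner extract: read the (non-constant) index for lane i.
    // Assumption: out-of-range indices wrap around.
    std::size_t lane = static_cast<std::size_t>(idx[i]) % N;
    // Outer extract: read the selected source element.
    out[i] = src[lane];
  }
  return out;
}
```

With src = {10, 20, 30, 40} and idx = {3, 0, 0, 2}, the model yields
{40, 10, 10, 30}. Because idx is only known at run time, this cannot be
expressed as a shufflevector.
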
This change can be followed by more improvements:
- Handle vectors with undef elements.
- Utilize pshufb and zero-mask blending to support more efficient construction of vectors with constant-0 elements.
- Use smaller-element vectors of the same width, and "interpolate" the indices, when no native operation is available.
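
One possible reading of the last item, sketched as a scalar model: a permute of
four 32-bit elements is emulated with a byte-granular permute over the same 16
bytes, by expanding each element index k into the byte indices 4*k .. 4*k+3.
All names here (permuteBytes, scaleIndices, permuteU32ViaBytes) are
hypothetical, the byte permute is a plain loop rather than a real instruction,
and masking the indices into range is an assumption of the sketch.

```cpp
#include <array>
#include <cstddef>
#include <cstdint>
#include <cstring>

using Bytes16 = std::array<std::uint8_t, 16>;
using U32x4 = std::array<std::uint32_t, 4>;

// Scalar stand-in for a byte-granular variable permute.
// Assumption: only the low 4 bits of each index are used.
Bytes16 permuteBytes(const Bytes16 &src, const Bytes16 &idx) {
  Bytes16 out{};
  for (std::size_t i = 0; i < 16; ++i)
    out[i] = src[idx[i] & 15];
  return out;
}

// "Interpolate" 32-bit element indices into byte indices:
// element index k becomes the four byte indices 4*k + 0 .. 4*k + 3.
Bytes16 scaleIndices(const U32x4 &idx) {
  Bytes16 byteIdx{};
  for (std::size_t i = 0; i < 4; ++i)
    for (std::size_t j = 0; j < 4; ++j)
      byteIdx[4 * i + j] = static_cast<std::uint8_t>(4 * (idx[i] & 3) + j);
  return byteIdx;
}

// Permute 32-bit elements using only the byte-granular permute.
U32x4 permuteU32ViaBytes(const U32x4 &src, const U32x4 &idx) {
  Bytes16 srcBytes{};
  std::memcpy(srcBytes.data(), src.data(), 16);
  Bytes16 outBytes = permuteBytes(srcBytes, scaleIndices(idx));
  U32x4 out{};
  std::memcpy(out.data(), outBytes.data(), 16);
  return out;
}
```

Since each element's bytes are moved together and in order, the result does not
depend on byte order: src = {10, 20, 30, 40} with idx = {3, 0, 0, 2} gives
{40, 10, 10, 30}, the same as a native 32-bit permute would.
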