[X86][AVX] Enable INSERT_SUBVECTOR(SRC0, SHUFFLE(SRC1)) shuffle combining
Push the insert_subvector up through the shuffle operands to help find more cross-lane shuffles.
The is exposes a couple of minor issues that will be fixed shortly:
Missed broadcast folds - we have a mixture of vzext_load lengths that need cleaning up
combine-sdiv.ll - AVX1 SimplifyDemandedVectorElts failure (hits max depth due to a couple of extra bitcasts).