The vperm2x128 instructions (vperm2f128 / vperm2i128) can shuffle zeros into either 128-bit half of the result at no extra cost: bits 3 and 7 of the control byte zero the low and high halves, respectively.
This patch recognizes that type of shuffle and generates the appropriate control byte.
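To illustrate the control-byte encoding, here is a minimal standalone sketch; it is not the code from this patch. The function name `vperm2x128Imm` and the sentinels `kUndef` / `kZero` are hypothetical names chosen for the example. It assumes a 4 x 64-bit mask where indices 0-3 select from the first source and 4-7 from the second.

```cpp
#include <array>
#include <cstdint>
#include <optional>

// Hypothetical sentinels for this sketch only.
constexpr int kUndef = -1; // element value does not matter
constexpr int kZero = -2;  // element must be zero

// Returns the vperm2x128 control byte for a 4 x 64-bit shuffle mask,
// or std::nullopt if the mask cannot be done as a single vperm2x128.
std::optional<uint8_t> vperm2x128Imm(const std::array<int, 4> &Mask) {
  uint8_t Imm = 0;
  for (int Half = 0; Half < 2; ++Half) {
    int Lo = Mask[2 * Half];
    int Hi = Mask[2 * Half + 1];
    bool LoZeroable = (Lo == kZero || Lo == kUndef);
    bool HiZeroable = (Hi == kZero || Hi == kUndef);
    unsigned Nibble;
    if (Lo == kUndef && Hi == kUndef) {
      Nibble = 0; // fully undef half: any selector works
    } else if (LoZeroable && HiZeroable) {
      Nibble = 0x8; // bit 3 of the nibble zeros this 128-bit half
    } else if (Lo == kZero || Hi == kZero) {
      return std::nullopt; // half zero, half data: not a whole-lane move
    } else {
      // Both defined elements must name the same whole 128-bit lane.
      int Lane = (Lo != kUndef) ? Lo / 2 : Hi / 2;
      if (Lo != kUndef && Lo != 2 * Lane)
        return std::nullopt;
      if (Hi != kUndef && Hi != 2 * Lane + 1)
        return std::nullopt;
      Nibble = static_cast<unsigned>(Lane); // 0-1: first source, 2-3: second
    }
    Imm = static_cast<uint8_t>(Imm | (Nibble << (4 * Half)));
  }
  return Imm;
}
```

For example, under these assumptions the mask {0, 1, zero, zero} maps to 0x80: the low half takes lane 0 of the first source and bit 7 zeros the high half.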
Note: I have a follow-on patch to convert vperm2 intrinsics with zero masks into generic shuffles. That should close the loop on this special-purpose x86 permute.
Note: The mask {0, 1, 6, 7} is a v4x64 blend, but we have already tried "lowerVectorShuffleAsBlend()" above, so the check for it here is redundant and I've removed it. The regression tests did not change after removing the check.
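As a quick illustration of why {0, 1, 6, 7} counts as a blend, the sketch below checks the usual blend property: every result element comes from the same position in one of the two sources. This is a standalone example with the hypothetical name `isBlendMask`; it is not the logic of lowerVectorShuffleAsBlend() itself.

```cpp
#include <array>
#include <cstddef>

// Hypothetical helper: a mask is a blend if element i is undef (-1),
// i (same position in the first source), or i + N (same position in
// the second source).
template <std::size_t N>
bool isBlendMask(const std::array<int, N> &Mask) {
  for (std::size_t I = 0; I < N; ++I) {
    int M = Mask[I];
    if (M == -1)
      continue; // undef matches anything
    if (M != static_cast<int>(I) && M != static_cast<int>(I + N))
      return false; // element would have to move between positions
  }
  return true;
}
```

With this definition, {0, 1, 6, 7} is a blend (elements 2 and 3 come from positions 2 and 3 of the second source), while {2, 3, 4, 5} is not, because it moves 128-bit lanes across positions.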