When we BITCAST to a vector of smaller elements and the entire source element is a splatted sign bit (the source's NumSignBits == SrcBitWidth), we can say that the destination's NumSignBits == DstBitWidth, as we're just splitting those sign bits across multiple elements.
We could generalize this, but at the moment the only use case I have is to peek through bitcasts to vector comparison results.
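To make the rule concrete, here is a minimal standalone sketch of it for a single source element. It does not use the LLVM API: `signBitsAfterBitcast` and `countSignBits` are hypothetical helpers written for illustration, and the fallback of 1 simply models "nothing known" when the source is not entirely sign bits.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper (not an LLVM function): models the rule described above
// for one source element of width srcBits being split into dstBits-wide pieces.
unsigned signBitsAfterBitcast(unsigned srcBits, unsigned srcSignBits,
                              unsigned dstBits) {
  assert(srcSignBits >= 1 && srcSignBits <= srcBits);
  // Only handle bitcasts to smaller elements that divide the source evenly.
  bool splitsEvenly = dstBits < srcBits && srcBits % dstBits == 0;
  // If every source bit is a copy of the sign bit, every destination element
  // is also made entirely of that bit.
  if (splitsEvenly && srcSignBits == srcBits)
    return dstBits;
  // Otherwise stay conservative: only the sign bit itself is known.
  return 1;
}

// Hypothetical helper: counts how many leading bits of the low `bits` bits of
// v are equal to the sign bit (the sign bit itself is included in the count).
unsigned countSignBits(uint64_t v, unsigned bits) {
  uint64_t sign = (v >> (bits - 1)) & 1;
  unsigned n = 1;
  for (int i = static_cast<int>(bits) - 2; i >= 0 && ((v >> i) & 1) == sign; --i)
    ++n;
  return n;
}
```

For example, a 64-bit comparison result is either all ones or all zeros, so each of the four 16-bit pieces it splits into is 0xFFFF or 0x0000 and `countSignBits` reports 16 for each piece, matching what `signBitsAfterBitcast(64, 64, 16)` returns.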
psubus.ll - @spatel I think you've encountered the PAND -> PBLENDVB+ZERO issue before - where is the best place to fix it? Do we just need to improve VSELECT/SHRUNKBLEND handling?
This code fits in 4 instructions:
cmp
cmp
and
pmovmskpd
What happens without the "and", with just cmp + bitcast?