[X86][Costmodel] `getReplicationShuffleCost()`: promote 8 bit-wide elements to 32 bit when no AVX512VBMI

Authored by lebedev.ri on Sun, Nov 14, 10:15 AM.



Currently, X86TTIImpl::getInterleavedMemoryOpCostAVX512() asks about the i8 element type,
so this change does affect vectorization. Eventually, it will ask about i1 instead.

We should also try to promote to i16 if we have AVX512BW; I'll do that in a follow-up.
All costs here look good; I've added the missing truncation costs in preparatory patches.
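
To illustrate the promotion decision described above, here is a minimal standalone sketch. It is not the code from this patch: `SubtargetFeatures`, `PromotionPlan`, and `planReplicationShuffle` are hypothetical names invented for the example, and it only models the i8 case from the title (8-bit elements are costed as 32-bit elements when AVX512VBMI is unavailable, with a truncation back to i8 afterwards).

```cpp
// Hypothetical stand-in for the subtarget query; not the LLVM API.
struct SubtargetFeatures {
  bool HasVBMI; // true if AVX512VBMI is available
};

// Hypothetical result type: the element width the shuffle is costed at,
// and whether a truncation back to the original width must also be costed.
struct PromotionPlan {
  unsigned CostedEltBits;
  bool NeedsTruncate;
};

// Sketch of the decision: without AVX512VBMI, 8-bit elements are promoted
// to 32 bits for the replication shuffle and truncated afterwards.
// All other widths are left unchanged in this simplified model.
PromotionPlan planReplicationShuffle(unsigned EltBits,
                                     const SubtargetFeatures &ST) {
  if (EltBits == 8 && !ST.HasVBMI)
    return {32, true};
  return {EltBits, false};
}
```

In this model, the total cost for the promoted case would be the extension, the 32-bit shuffle, and the truncation added together; the actual cost values come from the cost tables and are not reproduced here.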

lebedev.ri created this revision. Sun, Nov 14, 10:15 AM
RKSimon accepted this revision. Mon, Nov 15, 7:39 AM


This revision is now accepted and ready to land. Mon, Nov 15, 7:39 AM


Thank you for the review!

Promoting i8->i16 when we have BW but not VBMI appears to be mostly unprofitable, so I guess I'll skip directly to i1.