[X86][Costmodel] `getReplicationShuffleCost()`: promote 1 bit-wide elements to…

Authored by lebedev.ri on Nov 19 2021, 4:55 AM.


[X86][Costmodel] getReplicationShuffleCost(): promote 1 bit-wide elements to 8 bit when we have AVX512BW+AVX512VBMI

If, in addition to AVX512BW (which provides {k}<->{i8,i16} casts and i16 shuffles),
we also have AVX512VBMI, which provides i8 shuffles, we are in an optimal situation:
the 1 bit-wide elements can be promoted to 8 bit and shuffled at that width.
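
The width selection described above can be sketched roughly as follows. This is
an illustrative sketch, not the code from the patch: `Avx512Features` and
`promotedMaskEltBits()` are hypothetical names, and it assumes that i16 is the
promotion width when only AVX512BW is present. The case without AVX512BW is not
described in this commit message, so the sketch leaves it unspecified.

```cpp
#include <optional>

// Hypothetical feature flags for illustration only; the real cost model
// queries the X86 subtarget instead.
struct Avx512Features {
  bool HasBWI = false;  // AVX512BW: {k}<->{i8,i16} casts and i16 shuffles
  bool HasVBMI = false; // AVX512VBMI: i8 shuffles
};

// Returns the element width (in bits) to which 1 bit-wide elements would be
// promoted before the replication shuffle, under the assumptions stated above.
constexpr std::optional<unsigned> promotedMaskEltBits(Avx512Features F) {
  if (!F.HasBWI)
    return std::nullopt; // Not covered by this commit message.
  // With i8 shuffles available we can stay at 8 bit (assumed: otherwise i16).
  return F.HasVBMI ? 8u : 16u;
}
```

With both features set, the sketch picks 8 bit, which is the situation this
change targets.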

Reviewed By: RKSimon

Differential Revision: https://reviews.llvm.org/D114071