[X86][AVX] Truncate vectors with PACKSS/PACKUS on AVX2 targets

Authored by RKSimon on Mar 25 2021, 3:34 AM.


[X86][AVX] Truncate vectors with PACKSS/PACKUS on AVX2 targets

Until AVX512 we don't have any vector truncation instructions, and always lower using shuffles instead.

combineVectorTruncation performs this earlier than lowering as it makes it easier to use any sign/zero-extended bits in the truncated bits with PACKSS/PACKUS to perform the shuffle.

We currently don't attempt to use combineVectorTruncation on AVX2 targets as in the past 256-bit PACKSS/PACKUS tended to cause 128-bit lane shuffle regressions - but these should now be all resolved with combineHorizOpWithShuffle and in all cases we now reduce the amount of cross-lane shuffling and variable shuffle mask usage.

Differential Revision: https://reviews.llvm.org/D96609