I believe this effectively completes X86TTIImpl::getReplicationShuffleCost()
for AVX512, apart from the question of handling plain AVX512F,
where we end up with some really ugly "shuffles".
But then, are there any CPUs that support AVX512 but not AVX512DQ/AVX512BW?
Repository: rG LLVM Github Monorepo
Event Timeline
llvm/lib/Target/X86/X86TargetTransformInfo.cpp | ||
---|---|---|
3776 | AVX512F or AVX512DQ? |
llvm/lib/Target/X86/X86TargetTransformInfo.cpp | ||
---|---|---|
3776 | "<we can> promote to i32, AVX512F <then provides support for shuffling in that type>." |
llvm/lib/Target/X86/X86TargetTransformInfo.cpp | ||
---|---|---|
3776 | So why not promote for AVX512F-only targets? |
llvm/lib/Target/X86/X86TargetTransformInfo.cpp | ||
---|---|---|
3776 | I'm not sure I understand. AVX512DQ is the instruction set that provides the VPMOVM2[DQ] / VPMOV[DQ]2M instructions. |
llvm/lib/Target/X86/X86TargetTransformInfo.cpp | ||
---|---|---|
3776 | In other words, are you saying that we should always promote to i1<->i32 as a fallback, |
AVX512F or AVX512DQ?