If, in addition to AVX512BW (which provides {k}<->{i8,i16} casts and i16 shuffles),
we also have AVX512VBMI, which provides i8 shuffles, we are in the optimal situation.
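
To make the feature layering in the summary concrete, here is a minimal sketch of how the narrowest natively shuffleable element width follows from the available extensions. The `Features` struct and `narrowestShuffleEltBits` are hypothetical names for illustration only; they are not LLVM's subtarget API and not the code in this diff. The sketch assumes VBMI is only useful together with BW, and it maps each level to the cross-lane variable permute that extension adds (`vpermd`, `vpermw`, `vpermb`).

```cpp
// Hypothetical feature set; the field names are illustrative, not LLVM's API.
struct Features {
  bool avx512f;
  bool avx512bw;
  bool avx512vbmi;
};

// Narrowest element width, in bits, for which a native cross-lane variable
// shuffle exists. Returns 0 when AVX512F itself is missing.
// Assumption of this sketch: VBMI without BW is treated like plain AVX512F.
constexpr unsigned narrowestShuffleEltBits(Features f) {
  return !f.avx512f                     ? 0u
         : (f.avx512bw && f.avx512vbmi) ? 8u   // vpermb (AVX512VBMI)
         : f.avx512bw                   ? 16u  // vpermw (AVX512BW)
                                        : 32u; // vpermd (AVX512F)
}

// Compile-time checks of the three cases discussed in the review.
static_assert(narrowestShuffleEltBits(Features{true, true, true}) == 8,
              "BW + VBMI: i8 shuffles");
static_assert(narrowestShuffleEltBits(Features{true, true, false}) == 16,
              "BW only: i16 shuffles");
static_assert(narrowestShuffleEltBits(Features{true, false, false}) == 32,
              "plain AVX512F: i32 shuffles");
```

With plain AVX512F the result is 32, which matches the vXi32 shuffles discussed in the inline comments below.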
Details
Diff Detail
- Repository
- rG LLVM Github Monorepo
Event Timeline
llvm/lib/Target/X86/X86TargetTransformInfo.cpp | ||
---|---|---|
3677 | AVX512F will use (pretty awful) vXi32 shuffles: https://simd.godbolt.org/z/YYzjaf7Wh |
llvm/lib/Target/X86/X86TargetTransformInfo.cpp | ||
---|---|---|
3677 | Yes, those are pretty awful, I'm not sure if there's much hope for plain AVX512F, |
LGTM (VBMI)
llvm/lib/Target/X86/X86TargetTransformInfo.cpp | ||
---|---|---|
3677 | It's probably worth adding them instead of the scalarization bailout, though. |
Thank you for the review!
llvm/lib/Target/X86/X86TargetTransformInfo.cpp | ||
---|---|---|
3677 | I mean, yes, it is just not obvious to me how to do that without hardcoding them. |