Differential D37893
[X86] Prefer VPERMQ over VPERM2F128 for any unary shuffle, not just the ones that can be done with an insertf128
Closed, Public
Authored by craig.topper on Sep 15 2017, 12:48 AM.
Summary
The early out for AVX2 in lowerV2X128VectorShuffle is in an odd position: it sits below some shuffle mask equivalency checks, so it only fires for masks that pass those checks. But I think we want to allow VPERMQ for any unary shuffle.
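To make "any unary shuffle" concrete, here is a minimal standalone sketch of why VPERMQ can cover every such mask. The helper names `isUnaryMask` and `encodeVPERMQImm` and the `Mask4` type are hypothetical and not part of LLVM; the mask convention (0..3 select from V1, 4..7 from V2, -1 is undef) is assumed for illustration. The patch itself keys off V2 being undef rather than scanning the mask like this.

```cpp
#include <array>
#include <cassert>
#include <cstdint>

// Hypothetical helpers for illustration only; these are not LLVM functions.
// Assumed mask convention for a 4 x 64-bit shuffle:
// 0..3 select from V1, 4..7 select from V2, -1 means undef.
using Mask4 = std::array<int, 4>;

// A shuffle is unary when no element reads from the second operand.
bool isUnaryMask(const Mask4 &Mask) {
  for (int M : Mask)
    if (M >= 4)
      return false;
  return true;
}

// Encode the VPERMQ/VPERMPD imm8: two bits per destination element,
// with destination element I stored in bits [2*I+1 : 2*I].
// Undef elements are encoded as 0, which is an arbitrary choice.
uint8_t encodeVPERMQImm(const Mask4 &Mask) {
  assert(isUnaryMask(Mask) && "VPERMQ reads a single source vector");
  unsigned Imm = 0;
  for (int I = 0; I < 4; ++I) {
    int M = Mask[I] < 0 ? 0 : Mask[I];
    Imm |= static_cast<unsigned>(M) << (2 * I);
  }
  return static_cast<uint8_t>(Imm);
}
```

For example, the lane swap {2, 3, 0, 1} is unary and encodes to 0x4E, even though it cannot be expressed as a single insert of V1's low 128 bits.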
Event Timeline
Sep 15 2017, 10:55 AM: This revision is now accepted and ready to land.
Sep 15 2017, 11:12 AM: Closed by commit rL313373: [X86] Prefer VPERMQ over VPERM2F128 for any unary shuffle, not just the ones… (authored by ctopper). This revision was automatically updated to reflect the committed changes.
Revision Contents
Diff 115365
lib/Target/X86/X86ISelLowering.cpp
test/CodeGen/X86/avx-vperm2x128.ll
test/CodeGen/X86/avx512-shuffles/partial_permute.ll
test/CodeGen/X86/vector-shuffle-256-v8.ll
Does this even need to be inside the zero test? If V2 is undef, then V1 should not be zero. Can we place it before lowerVectorShuffleAsBlend?
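For context on the zero test mentioned here, the sketch below shows how a VPERM2F128/VPERM2I128 immediate can replace a zero input with an implicit zero, which is my understanding of why zero operands are handled separately. The function `encodeVPERM2X128Imm` and its parameters are hypothetical, not LLVM code; the widened mask convention (0..1 are V1's 128-bit lanes, 2..3 are V2's, -1 is undef) is assumed.

```cpp
#include <array>
#include <cassert>
#include <cstdint>

// Hypothetical sketch, not LLVM code. Widened 128-bit lane mask:
// 0..1 select a lane of V1, 2..3 select a lane of V2, -1 means undef.
// imm8 layout: bits [1:0] pick the source lane for the low half and
// bit 3 zeroes it; bits [5:4] pick the source lane for the high half
// and bit 7 zeroes it.
uint8_t encodeVPERM2X128Imm(const std::array<int, 2> &Lanes, bool V1IsZero,
                            bool V2IsZero) {
  unsigned Imm = 0;
  for (int I = 0; I < 2; ++I) {
    int M = Lanes[I];
    if (M < 0)
      continue; // Undef half: leave its selector bits as 0.
    assert(M < 4 && "lane index out of range");
    bool ReadsZero = (M < 2) ? V1IsZero : V2IsZero;
    if (ReadsZero)
      Imm |= 0x8u << (4 * I); // Zero this half instead of reading the input.
    else
      Imm |= static_cast<unsigned>(M) << (4 * I);
  }
  return static_cast<uint8_t>(Imm);
}
```

Because the zeroing bits make a zero operand free, a shuffle with a zero input is worth routing to VPERM2X128. A unary shuffle has V2 undef, so the question above is whether that case can be separated out before the zero handling.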