This rarely comes up because most vselects are actually lowered with
AVX512 mask instructions, but it is an improvement in the rare cases where it applies.
- Repository
 - rG LLVM Github Monorepo
 
Event Timeline
| llvm/lib/Target/X86/X86ISelDAGToDAG.cpp | ||
|---|---|---|
| 1037 | Update comment  | |
| llvm/lib/Target/X86/X86ISelDAGToDAG.cpp | ||
|---|---|---|
| 1044 | Why is the PCMPGT check needed?  | |
| llvm/lib/Target/X86/X86ISelDAGToDAG.cpp | ||
|---|---|---|
| 1043 | A single hasVLX check is enough.  | |
| llvm/lib/Target/X86/X86ISelDAGToDAG.cpp | ||
|---|---|---|
| 1044 | Wouldn't a numsignbits check be better than PCMPGT?  | |
| llvm/lib/Target/X86/X86ISelDAGToDAG.cpp | ||
|---|---|---|
| 1044 | blendv only uses the sign bit of the control. For vpternlog to be a replacement, the control needs to be in mask form (each element either -1 or 0). The pcmpgt check is just a test for "is the control in mask form". Maybe (sext (setcc ...)) would also work, but I didn't see any codegen changes from it.  | |
| 1044 | What is a numsignbits check?  | |
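
The mask-form requirement discussed above can be illustrated with a scalar model of a single 32-bit lane. This is only a sketch, not the code under review: `blendvLane`, `ternlogLane`, `isMaskForm`, and `kSelectImm` are hypothetical names, and the `0xCA` immediate (which encodes `a ? b : c` when the control is the first operand) is an illustrative choice that may differ from the operand order and immediate the patch actually emits.

```cpp
#include <cstdint>

// Scalar model of one 32-bit lane of a blendv-style select: only the sign
// bit of the control decides which input is taken.
constexpr uint32_t blendvLane(uint32_t ctrl, uint32_t onTrue, uint32_t onFalse) {
  return (ctrl & 0x80000000u) ? onTrue : onFalse;
}

// Scalar model of a ternary-logic operation: for every bit position the
// three input bits form an index (a is the high bit, c the low bit) into
// the 8-bit truth table `imm`.
constexpr uint32_t ternlogLane(uint32_t a, uint32_t b, uint32_t c, uint8_t imm) {
  uint32_t result = 0;
  for (unsigned bit = 0; bit < 32; ++bit) {
    unsigned idx = (((a >> bit) & 1u) << 2) | (((b >> bit) & 1u) << 1) |
                   ((c >> bit) & 1u);
    result |= static_cast<uint32_t>((imm >> idx) & 1u) << bit;
  }
  return result;
}

// Assumed truth table for a bitwise "a ? b : c" (illustrative operand order).
constexpr uint8_t kSelectImm = 0xCA;

// True when the control lane is in mask form (all ones or all zeros).
constexpr bool isMaskForm(uint32_t ctrl) {
  return ctrl == 0u || ctrl == 0xFFFFFFFFu;
}
```

When `isMaskForm(ctrl)` holds, `blendvLane(ctrl, t, f)` and `ternlogLane(ctrl, t, f, kSelectImm)` agree for all `t` and `f`. A control such as `0x80000000` has its sign bit set but is not in mask form, so the two models can disagree, which is why the transform needs the check.
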
| llvm/lib/Target/X86/X86ISelDAGToDAG.cpp | ||
|---|---|---|
| 1044 | I'll use CurDAG->ComputeNumSignBits; if we are being correct, that should prove it.  | |
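
As a rough illustration of why a sign-bit count can stand in for the pcmpgt check: an element whose number of sign bits equals its full bit width can only be 0 or -1. The sketch below counts sign bits on a concrete 32-bit value; `numSignBits` and `provablyMask` are hypothetical helpers, not LLVM APIs. The analysis in the compiler works on values unknown at compile time and returns a conservative lower bound, so it may fail to prove mask form even when it holds.

```cpp
#include <cstdint>

// Number of leading bits that are equal to the sign bit, counting the sign
// bit itself. The result is always in the range [1, 32].
constexpr unsigned numSignBits(int32_t v) {
  uint32_t u = static_cast<uint32_t>(v);
  uint32_t sign = u >> 31;
  unsigned n = 1;
  for (int bit = 30; bit >= 0 && ((u >> bit) & 1u) == sign; --bit)
    ++n;
  return n;
}

// A lane is in mask form (0 or -1) exactly when every bit is a sign bit.
constexpr bool provablyMask(int32_t v) { return numSignBits(v) == 32; }
```

For example, `numSignBits(-1)` and `numSignBits(0)` are both 32, while `numSignBits(1)` is 31, so only the first two values qualify as a mask.
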
| llvm/test/CodeGen/X86/vselect-pcmp.ll | ||
|---|---|---|
| 37 ↗ | (On Diff #502256) | AVX512F basically means Knights Landing, and even though you'd have to use the zmm variant, vpternlogq is a LOT faster than vpblendvb on KNL.  | 
| llvm/test/CodeGen/X86/vselect-pcmp.ll | ||
|---|---|---|
| 37 ↗ | (On Diff #502256) | I could maybe see it for ymm->zmm, but xmm->zmm will then require a vzeroupper, will stall while the core prepares for zmm usage (if no zmm code is around), and will increase the license level. It could also be dangerous if there is SSE-encoded code around it.  | 