As noted in PR11210:
https://bugs.llvm.org/show_bug.cgi?id=11210
...fixing this should allow us to eliminate x86-specific masked store intrinsics in IR.
(Although more testing will be needed to confirm that.)
I don't know anything about SKX, so I don't know if the existing code is optimal or not. If we want to handle that case too, we could add a check for a 'PCMPGTM' opcode or try to fold this earlier when the node is still a setcc.
Is there any canonical form of compare-with-all-zeros that can be guaranteed here? Or should the pattern with (pcmplt X, 0) be added also?