This is needed to make the v64i8 and v32i16 types legal for the 512-bit VBMI instructions. Fixes PR30912.
This is the LLVM side of the change. The Clang part is at https://reviews.llvm.org/D26306
I removed -mcpu=skx from the VBMI intrinsic test cases so that we can be sure VBMI itself is enabling BWI to make the types legal. Somehow that changed the order of the addb instructions. Maybe it has something to do with the scheduler model?