Extracting a vXi1 subvector from a vYi1 vector is poorly supported by LLVM and most of the time ends in an assertion failure.
This patch fixes the issue by adding new patterns to the TD file.
Event Timeline
lib/Target/X86/X86InstrAVX512.td | ||
---|---|---|
2945 | Why can't we use an SDNodeXForm to rewrite the immediate instead of generating so many patterns? See getInsertVINSERTImmediate in X86ISelDAGToDAG.cpp. | |
test/CodeGen/X86/avx512-extract-subvector-load-store.ll | ||
3 | I believe we're trying to avoid using -mcpu in tests. Please use -mattr instead. |
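To illustrate the SDNodeXForm question above, here is a minimal sketch of what a mask subvector extract computes. It assumes the extract is lowered to a mask right shift followed by a narrowing of the result type. The function `extractMaskSubvector` and the integer model of a mask register are hypothetical and are not LLVM code. The point is only that the shift amount is a direct function of the extract index, so one parameterized rule could in principle cover every legal index.

```cpp
#include <cstdint>

// Hypothetical model, not LLVM code: a vYi1 mask is held in the low bits of a
// 16-bit integer, with element 0 in bit 0.
constexpr std::uint16_t extractMaskSubvector(std::uint16_t mask, unsigned idx,
                                             unsigned numElts) {
  // Shift the requested elements down to bit 0 (the right-shift step), then
  // keep only numElts bits (the narrower result type).
  return static_cast<std::uint16_t>((mask >> idx) & ((1u << numElts) - 1u));
}

// Compile-time checks on the mask 0b11010000 (bits 4, 6 and 7 set).
static_assert(extractMaskSubvector(0b11010000, 4, 2) == 0b01, "elements 4..5");
static_assert(extractMaskSubvector(0b11010000, 6, 2) == 0b11, "elements 6..7");
```

In this model the only thing that varies between extract positions is `idx`, which is the value an immediate-rewriting transform would have to produce.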
lib/Target/X86/X86InstrAVX512.td | ||
---|---|---|
2981 | The difference here is the type of the instruction: word vs. byte. The byte form, as you said, needs DQ, while the word form doesn't. The second pattern will be matched if the target doesn't have DQ. |
We shouldn’t rely on pattern order in the td file if we can avoid it. A pattern that is meant for when DQI is disabled should carry an explicit no-DQI predicate.
But in this case I see no good reason to have two different patterns. Why can’t we use KSHIFTW even when DQI is enabled? Is there a downside?
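One way to reason about the KSHIFTW question is to compare the two shift widths on plain integers. The sketch below is a hypothetical model, not LLVM or intrinsic code, and the helper names are made up for illustration. It assumes the mask lives in the low bits of a 16-bit value and that only the low `numElts` bits of the result are read. It covers the bit arithmetic only and says nothing about encoding, latency, or how LLVM treats the upper bits.

```cpp
#include <cstdint>

// Hypothetical model of a 16-bit (word) mask right shift.
constexpr std::uint16_t shiftRightWord(std::uint16_t k, unsigned imm) {
  return static_cast<std::uint16_t>(k >> imm);
}

// Hypothetical model of an 8-bit (byte) mask right shift: only the low 8 bits
// of the source take part.
constexpr std::uint8_t shiftRightByte(std::uint16_t k, unsigned imm) {
  return static_cast<std::uint8_t>((k & 0xFFu) >> imm);
}

// True when both shifts produce the same low numElts bits.
constexpr bool lowBitsAgree(std::uint16_t k, unsigned idx, unsigned numElts) {
  const unsigned keep = (1u << numElts) - 1u;
  return (shiftRightWord(k, idx) & keep) == (shiftRightByte(k, idx) & keep);
}

// Within an 8-element mask (idx + numElts <= 8) the results match, even with
// arbitrary data in bits 8..15 of the source.
static_assert(lowBitsAgree(0xABCD, 6, 2), "v2i1 at index 6");
static_assert(lowBitsAgree(0xABCD, 4, 4), "v4i1 at index 4");
// Outside that range the word shift pulls in upper bits and they differ.
static_assert(!lowBitsAgree(0xABCD, 6, 4), "out-of-range extract");
```

Under these assumptions the word shift yields the same extracted elements as the byte shift for every in-range extract from an 8-element mask.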
lib/Target/X86/X86InstrAVX512.td | ||
---|---|---|
2967 | Are we missing a VK8->VK2 pattern without DQI? | |
test/CodeGen/X86/avx512-extract-subvector-load-store.ll | ||
3 | Add a command line without DQI as well? |
Please fix the indentation here. And I suspect this overflows 80 columns.