Page MenuHomePhabricator

[X86][AVX512] Adding new patterns for extract_subvector of vXi1
ClosedPublic

Authored by m_zuckerman on Oct 25 2017, 8:54 AM.

Details

Summary

extract subvector of vXi1 from vYi1 is poorly supported by LLVM and most of the time end with an assertion.
This patch fixes this issue by adding new patterns to the TD file.

Diff Detail

Repository
rL LLVM

Event Timeline

m_zuckerman created this revision.Oct 25 2017, 8:54 AM
m_zuckerman edited the summary of this revision. (Show Details)Oct 25 2017, 8:57 AM
craig.topper added inline comments.Oct 25 2017, 9:10 AM
lib/Target/X86/X86InstrAVX512.td
2944 ↗(On Diff #120266)

Why can't we use an SDNodeXForm to rewrite the immediate instead of generating so many patterns? See getInsertVINSERTImmediate in X86ISelDAGToDAG.cpp.

test/CodeGen/X86/avx512-extract-subvector-load-store.ll
2 ↗(On Diff #120266)

I believe we're trying to avoid using -mcpu in tests. Please use -mattr

craig.topper edited edge metadata.Oct 26 2017, 1:59 PM

Unless I'm mistaken, KSHIFTRB is only legal with AVX512DQ but that doesn't seem to be accounted for here.

lib/Target/X86/X86InstrAVX512.td
2944 ↗(On Diff #120423)

Please fix the indentation here. And I suspect this overflows 80 columns.

2951 ↗(On Diff #120423)

fix indentation and 80 columns.

2981 ↗(On Diff #120423)

How is this different than line 2965?

m_zuckerman added inline comments.Oct 27 2017, 8:16 AM
lib/Target/X86/X86InstrAVX512.td
2981 ↗(On Diff #120423)

The difference here is the type of the instruction word vs byte. Byte, as you said, needs DQ while word doesn't need it. The second pattern will be caught If a target doesn't have DQ.

We shouldn’t rely on pattern order in the td file if we can avoid it. So a pattern that is for when DQI is disabled should have a no DQI predicate.

But in this case I see no good reason to have two different patterns. Why can’t we use KSHIFTW even when DQI is enabled? Is there a downside?

lib/Target/X86/X86InstrAVX512.td
2967 ↗(On Diff #120614)

Are we missing a VK8->VK2 pattern without DQI?

test/CodeGen/X86/avx512-extract-subvector-load-store.ll
2 ↗(On Diff #120614)

Add a command line without DQI as well?

We shouldn’t rely on pattern order in the td file if we can avoid it. So a pattern that is for when DQI is disabled should have a no DQI predicate.

But in this case I see no good reason to have two different patterns. Why can’t we use KSHIFTW even when DQI is enabled? Is there a downside?

You are right, we can use the KSHIFTW.

m_zuckerman marked 3 inline comments as done.Oct 29 2017, 6:19 AM
This revision is now accepted and ready to land.Oct 30 2017, 10:43 AM
This revision was automatically updated to reflect the committed changes.