Following this RFC.
This patch defines the i1 type as illegal in the X86 backend for AVX512.
for DAG operations on <N x i1> types (build vector, extract vector element, ...) i8 is used, and should be truncated/extended.
A v1i1 MVT is added to support scalar instruction that use masks, like floating point compare.
Details
Diff Detail
- Repository
- rL LLVM
Event Timeline
I still need to look at this some more, but here a few comments.
include/llvm/CodeGen/MachineValueType.h | ||
---|---|---|
58 ↗ | (On Diff #95897) | This needs to be rebased on top of a commit that went in today that changed this for ScalableVector MVTs. |
448 ↗ | (On Diff #95897) | Can you put the scalar type before the vector type? That would be more consistent with the other code. |
lib/Target/X86/X86InstrInfo.cpp | ||
6318 ↗ | (On Diff #95897) | Why is this code coming back? We should use the right SUBREG operations in the output patterns so that we only do GR32/GR64 copies. |
lib/Target/X86/X86CallingConv.td | ||
---|---|---|
76 ↗ | (On Diff #96536) | Update comment |
77 ↗ | (On Diff #96536) | Merge with the line below and have 4 types? |
150 ↗ | (On Diff #96536) | Update comment |
151 ↗ | (On Diff #96536) | Shouldn't this be merge with the line below to list 3 types? |
492 ↗ | (On Diff #96536) | Update comment. |
494 ↗ | (On Diff #96536) | Isn't the original line redundant? |
592 ↗ | (On Diff #96536) | Update comment |
594 ↗ | (On Diff #96536) | Isn't the old line here redundant? |
805 ↗ | (On Diff #96536) | Fix the comment |
807 ↗ | (On Diff #96536) | Isn't the second line here redundant? |
827 ↗ | (On Diff #96536) | Merge with the line below and fix comment |
840 ↗ | (On Diff #96536) | Merge with the line below and fix comment |
871 ↗ | (On Diff #96536) | Merge with the line below and fix comment |
879 ↗ | (On Diff #96536) | Merge with the line below and fix comment. |
900 ↗ | (On Diff #96536) | Merge with the line below and fix comment |
lib/Target/X86/X86ISelLowering.cpp | ||
6990 ↗ | (On Diff #96536) | Why did this if statement need to change? |
14037 ↗ | (On Diff #96536) | What's the different between this and the original line? Doesn't Op's VT have to match the elements of the vector? |
19127 ↗ | (On Diff #96536) | This seems like an unrelated cleanup. Commit separately? Why are curly braces being added? |
29618 ↗ | (On Diff #96536) | Can we reuse CondVT here instead of calling Cond.getValueType() again? Why cant' we use DAG.getAllOnesConstant anymore? Does the constant have different with than CondVT's element type now? What does that do? |
lib/Target/X86/X86InstrAVX512.td | ||
2303 ↗ | (On Diff #96536) | Align v16i1 to same column as v8i1 above. |
2316 ↗ | (On Diff #96536) | The identation on the VK16 seems off. Same with VK8 on the pattern below. |
lib/Target/X86/X86InstrInfo.cpp | ||
6333 ↗ | (On Diff #96536) | Not sure why there are two blanks lines in the current code, but don't delete the extra one in this patch. |
lib/Target/X86/X86ISelLowering.cpp | ||
---|---|---|
6990 ↗ | (On Diff #96536) | just leftovers from other changes i had here which weren't needed. |
14037 ↗ | (On Diff #96536) | since i1 is illegal, Op VT would be i8 when extracting from i1 vectors. it is ok for the return type of extract vector element to be wider than the vector elements. |
29618 ↗ | (On Diff #96536) | getAllOnesConstant probably just got lost in a merge. there is no issue with it |
test/CodeGen/X86/avx512-intrinsics.ll | ||
---|---|---|
125 ↗ | (On Diff #98266) | Any ideas what is going on here? |
test/CodeGen/X86/avx512-intrinsics.ll | ||
---|---|---|
125 ↗ | (On Diff #98266) | SelectionDAG.cpp::FoldConstantArithmetic should eliminate this. without this patch the build vector was v16i1 = build vector i1, i1, ... i've tried implementing proper handling for implicit truncation but i'm getting some failure in mips that i still need to investigate. |
LGTM - I'm happy for the missed constant fold to be handled in a followup, add a TODO if you can.
test/CodeGen/X86/avx512-intrinsics.ll | ||
---|---|---|
125 ↗ | (On Diff #98266) | SelectionDAG::FoldConstantVectorArithmetic does the explicit truncation/extension - not sure if you can use that. |
test/CodeGen/X86/avx512-intrinsics.ll | ||
---|---|---|
125 ↗ | (On Diff #98266) | Thanks! |