Enable WidenVecOp_VECREDUCE_SEQ for scalable vector by inserting some splatted neutral vectors.
e.g.
when widen <vscale x 10 x half> to <vscale x 16 x half> in index (10, 12, 14) inserted nxv2f16:splat<neutral element>
Details
Diff Detail
- Repository
- rG LLVM Github Monorepo
Event Timeline
llvm/lib/CodeGen/SelectionDAG/LegalizeVectorTypes.cpp | ||
---|---|---|
6153 | Why was this blank line removed? |
llvm/lib/CodeGen/SelectionDAG/LegalizeVectorTypes.cpp | ||
---|---|---|
6152 | I may be missing something here, but I'm not sure I understand the need for a GCD or for a loop that generates WideVT.getVectorMinNumElements() INSERT_SUBVECTORs. Should this code not just insert VecOp into a splat(NeutralElem) of type WideVT using INSERT_SUBVECTOR? i.e. vecreduce_seq(<vscale x 10 x half> op) <=> vecreduce_seq( insert_subvector(op, <vscale x 16 x half> splat(%neutral.elem), /*idx=*/0)) where <vscale x 10 x half> is OrigVT and <vscale x 16 x half> is WideVT. |
llvm/lib/CodeGen/SelectionDAG/LegalizeVectorTypes.cpp | ||
---|---|---|
6152 | I think you meant insert_subvector(<vscale x 16 x half> splat(%neutral.elem), op, /*idx=*/0). op would be the subvector so should be the second argument. That would need to go through WidenVecOp_INSERT_SUBVECTOR, but it will fail because InVec isn't undef. |
llvm/lib/CodeGen/SelectionDAG/LegalizeVectorTypes.cpp | ||
---|---|---|
6152 | If we do like this vecreduce_seq(insert_subvector(<vscale x 16 x half> splat, vecop, /*idx=*/0)), that would go through WidenVecOp_INSERT_SUBVECTOR because of vecop need to widen. |
llvm/lib/CodeGen/SelectionDAG/LegalizeVectorTypes.cpp | ||
---|---|---|
6153 | Mistakenly deleted blank line. Restored back. |
llvm/lib/CodeGen/SelectionDAG/LegalizeVectorTypes.cpp | ||
---|---|---|
6152 | Thanks, that makes sense. | |
6153 | For SVE there is still the problem that the type action of SplatVT may be TypeWiden as well, because we haven't implemented all support for nxv1 types yet. The approach taken here seems valid though, it just means that we can't have the same tests for SVE yet. | |
llvm/test/CodeGen/RISCV/rvv/vreductions-fp-sdnode.ll | ||
1081 | Could you also add @vreduce_ord_fadd_nxv6f16 and @vreduce_ord_fadd_nxv10f16 to llvm/test/CodeGen/AArch64/vecreduce-fadd-legalization-strict.ll as well ? (and then just re-generate the CHECK lines using the update_lcc_test_checks.py script) |
llvm/test/CodeGen/RISCV/rvv/vreductions-fp-sdnode.ll | ||
---|---|---|
1081 | And @vreduce_ord_fadd_nxv12f16 as well. |
llvm/lib/CodeGen/SelectionDAG/LegalizeVectorTypes.cpp | ||
---|---|---|
6153 | Should I need to add getTypeAction(SplatVT) != TargetLowering::TypeWidenVector to avoid this problem now? |
llvm/lib/CodeGen/SelectionDAG/LegalizeVectorTypes.cpp | ||
---|---|---|
6153 | No I don't think that's necessary, but thanks for the suggestion! When we fix the nxv1 legalisation of insert_subvector, we'll get the support added in this patch for free. And the compiler will currently crash with an error either way :) It was more just an observation. |
llvm/test/CodeGen/RISCV/rvv/vreductions-fp-sdnode.ll | ||
---|---|---|
1081 | This vecreduce-fadd-legalization-strict.ll file is only for NEON feature. So, I add tests in sve-fp-reduce.ll. Is it OK? |
LGTM!
llvm/test/CodeGen/RISCV/rvv/vreductions-fp-sdnode.ll | ||
---|---|---|
1081 | Thanks, I hadn't seen that. That's fine! |
I may be missing something here, but I'm not sure I understand the need for a GCD or for a loop that generates WideVT.getVectorMinNumElements() INSERT_SUBVECTORs. Should this code not just insert VecOp into a splat(NeutralElem) of type WideVT using INSERT_SUBVECTOR?
i.e.
where <vscale x 10 x half> is OrigVT and <vscale x 16 x half> is WideVT.