This is an archive of the discontinued LLVM Phabricator instance.

[RISCV] Handle subregs when folding vmerge into vops
Needs ReviewPublic

Authored by luke on Aug 7 2023, 3:45 AM.

Details

Summary

A vmerge/vmv.v.v can sometimes be inserting a smaller vector into a larger one,
in which case the underlying vop is obscured underneath COPY_TO_REGCLASS and
INSERT_SUBREG nodes:

t22: nxv4i32 = PseudoVMV_V_V_M2 t42, t38, ...
t38: nxv4i32 = INSERT_SUBREG IMPLICIT_DEF:nxv4i32, t40, TargetConstant:i32<4>

t40: v2i32 = COPY_TO_REGCLASS t41, TargetConstant:i64<22>
  t41: nxv1i32,ch = PseudoVLE32_V_MF2 ...

This patch extends the fold peephole to handles this case by peeling off these
extra nodes, and then adding back the necessary insert_subreg/extract_subregs:

t48: nxv4i32 = INSERT_SUBREG t42, t47, TargetConstant:i32<4>
t47: nxv1i32,ch = PseudoVLE32_V_MF2_MASK t46, ...

t46: nxv1i32 = EXTRACT_SUBREG t42, TargetConstant:i32<4>

The subregister being inserted into needs to be the bottom subregister, i.e.
index 0.

Diff Detail

Event Timeline

luke created this revision.Aug 7 2023, 3:45 AM
Herald added a project: Restricted Project. · View Herald TranscriptAug 7 2023, 3:45 AM
luke requested review of this revision.Aug 7 2023, 3:45 AM
Herald added a project: Restricted Project. · View Herald TranscriptAug 7 2023, 3:45 AM
reames added inline comments.Aug 7 2023, 2:51 PM
llvm/lib/Target/RISCV/RISCVISelDAGToDAG.cpp
3354

I'm looking at this, and am trying to understand why we get the COPY_TO_REGCLASS here. If I'm reading this right, we should be generating this only when doing a insert_subreg between two equally LMUL-sized register groups. Given that, I think we must have had two insert_subregs originally, and shouldn't one of have been combined away?

luke added inline comments.Aug 8 2023, 10:34 AM
llvm/lib/Target/RISCV/RISCVISelDAGToDAG.cpp
3354

IIUC we generate COPY_TO_REGCLASSes when doing either an or insert_subvector extract_subvector between two equally LMUL-sized register groups.
The input graph before isel seems to have both an insert_subvector and extract_subvector, is that the bit that you're saying should be combined?

t0: ch,glue = EntryToken
    t2: nxv4i32,ch = CopyFromReg t0, Register:nxv4i32 %0
          t6: nxv2i32,ch = CopyFromReg t0, Register:nxv2i32 %1
          t9: nxv2i32,ch = CopyFromReg t0, Register:nxv2i32 %2
          t23: nxv2i1 = RISCVISD::VMSET_VL Constant:i64<4>
        t24: nxv2i32 = RISCVISD::ADD_VL t6, t9, undef:nxv2i32, t23, Constant:i64<4>
      t25: v4i32 = extract_subvector t24, Constant:i64<0>
    t26: nxv4i32 = insert_subvector undef:nxv4i32, t25, Constant:i64<0>
    t28: nxv4i1 = RISCVISD::VMSET_VL Constant:i64<8>
  t30: nxv4i32 = RISCVISD::VSLIDEUP_VL t2, t26, Constant:i64<4>, t28, Constant:i64<8>, TargetConstant:i64<1>
t18: ch,glue = CopyToReg t0, Register:nxv4i32 $v8m2, t30
t19: ch = RISCVISD::RET_GLUE t18, Register:nxv4i32 $v8m2, t18:1

The insert_subvector gets selected as an INSERT_SUBREG:

t0: ch,glue = EntryToken
    t2: nxv4i32,ch = CopyFromReg t0, Register:nxv4i32 %0
          t6: nxv2i32,ch = CopyFromReg t0, Register:nxv2i32 %1
          t9: nxv2i32,ch = CopyFromReg t0, Register:nxv2i32 %2
          t23: nxv2i1 = RISCVISD::VMSET_VL Constant:i64<4>
        t24: nxv2i32 = RISCVISD::ADD_VL t6, t9, undef:nxv2i32, t23, Constant:i64<4>
      t25: v4i32 = extract_subvector t24, Constant:i64<0>
    t36: nxv4i32 = INSERT_SUBREG undef:nxv4i32, t25, TargetConstant:i32<4>
  t30: nxv4i32 = PseudoVSLIDEUP_VI_M2 t2, t36, TargetConstant:i64<4>, TargetConstant:i64<8>, TargetConstant:i64<5>, TargetConstant:i64<1>
t18: ch,glue = CopyToReg t0, Register:nxv4i32 $v8m2, t30
t37: i64 = TargetConstant<20>
t19: ch = PseudoRET Register:nxv4i32 $v8m2, t18, t18:1

And then the extract_subvector gets selected as a COPY_TO_REGCLASS

t0: ch,glue = EntryToken
    t2: nxv4i32,ch = CopyFromReg t0, Register:nxv4i32 %0
          t6: nxv2i32,ch = CopyFromReg t0, Register:nxv2i32 %1
          t9: nxv2i32,ch = CopyFromReg t0, Register:nxv2i32 %2
          t23: nxv2i1 = RISCVISD::VMSET_VL Constant:i64<4>
        t24: nxv2i32 = RISCVISD::ADD_VL t6, t9, undef:nxv2i32, t23, Constant:i64<4>
      t38: v4i32 = COPY_TO_REGCLASS t24, TargetConstant:i64<20>
    t36: nxv4i32 = INSERT_SUBREG undef:nxv4i32, t38, TargetConstant:i32<4>
  t30: nxv4i32 = PseudoVSLIDEUP_VI_M2 t2, t36, TargetConstant:i64<4>, TargetConstant:i64<8>, TargetConstant:i64<5>, TargetConstant:i64<1>
t18: ch,glue = CopyToReg t0, Register:nxv4i32 $v8m2, t30
t19: ch = PseudoRET Register:nxv4i32 $v8m2, t18, t18:1
luke updated this revision to Diff 550700.Aug 16 2023, 4:53 AM

Remove extraneous dumpr and rebase

reames added inline comments.Aug 17 2023, 11:30 AM
llvm/lib/Target/RISCV/RISCVISelDAGToDAG.cpp
3354

https://reviews.llvm.org/D158201 should improve this. Not sure it entirely resolves the need for the COPY_TO_REGCLASS, but it at least picks up the redundant pair.