This is an archive of the discontinued LLVM Phabricator instance.

[RISCV] Narrow types of index operand matched pattern (shl_vl (zext_vl), C)
AbandonedPublic

Authored by liaolucy on Aug 17 2023, 3:08 AM.

Details

Summary

Similar to D154687, we can optimize fixed-length vectors.

lowerBUILD_VECTOR lowers BUILD_VECTOR to RISCVISD::VMV_V_X_VL.
Add isVMV_V_X_VLOfConstant, which implements functionality similar to isBuildVectorOfConstantSDNodes.

The extract_subvector + insert_subvector pair can be optimized in
visitINSERT_SUBVECTOR.
The optimization is performed earlier here to enable better type narrowing.

Diff Detail

Event Timeline

liaolucy created this revision.Aug 17 2023, 3:08 AM
liaolucy requested review of this revision.Aug 17 2023, 3:08 AM
Herald added a project: Restricted Project.Aug 17 2023, 3:08 AM
fakepaper56 added inline comments.Aug 17 2023, 5:02 AM
llvm/lib/Target/RISCV/RISCVISelLowering.cpp
11229

The combine code exists in DAGCombiner::visitINSERT_SUBVECTOR. Why do we need this here?

liaolucy added inline comments.Aug 17 2023, 6:32 AM
llvm/lib/Target/RISCV/RISCVISelLowering.cpp
11229

An example:

Legalized selection DAG: %bb.0 'vpscatter_baseidx_zext_v8i8_v8i16:'
SelectionDAG has 26 nodes:
  t0: ch,glue = EntryToken
  t32: nxv4i1 = RISCVISD::VMSET_VL Constant:i64<8>
      t2: nxv4i16,ch = CopyFromReg t0, Register:nxv4i16 %0
      t6: i64,ch = CopyFromReg t0, Register:i64 %1
          t8: nxv4i8,ch = CopyFromReg t0, Register:nxv4i8 %2
        t33: nxv4i64 = RISCVISD::VZEXT_VL t8, t32, Constant:i64<8>
            t43: nxv4i64 = RISCVISD::VMV_V_X_VL undef:nxv4i64, Constant:i64<1>, Constant:i64<8>
          t44: v8i64 = extract_subvector t43, Constant:i64<0>
        t36: nxv4i64 = insert_subvector undef:nxv4i64, t44, Constant:i64<0>
      t37: nxv4i64 = RISCVISD::SHL_VL t33, t36, undef:nxv4i64, t32, Constant:i64<8>
      t11: nxv4i1,ch = CopyFromReg t0, Register:nxv4i1 %3
        t14: i64,ch = CopyFromReg t0, Register:i64 %4
      t16: i64 = AssertZext t14, ValueType:ch:i32
    t42: ch = llvm.riscv.vsoxei.mask<(store unknown-size, align 2)> t0, TargetConstant:i64<8783>, t2, t6, t37, t11, t16
  t28: ch = RISCVISD::RET_GLUE t42

visitINSERT_SUBVECTOR is executed after this code and blocks this optimization, so these combines are added here.

fakepaper56 added inline comments.Aug 20 2023, 12:21 AM
llvm/lib/Target/RISCV/RISCVISelLowering.cpp
11201

It may cause an assertion error when N has exactly one operand.

fakepaper56 added inline comments.Aug 20 2023, 4:20 AM
llvm/lib/Target/RISCV/RISCVISelLowering.cpp
11201

I think you could write it as:

if (N.getOpcode() == RISCVISD::VMV_V_X_VL && isa<ConstantSDNode>(N->getOperand(1))) {
  SplatVal = N->getConstantOperandVal(1);
  return true;
}
fakepaper56 added inline comments.Aug 20 2023, 7:35 AM
llvm/lib/Target/RISCV/RISCVISelLowering.cpp
11229

Thank you for your elaboration. I misunderstood the order of legalization.

liaolucy updated this revision to Diff 553346.Aug 24 2023, 9:10 PM

Address comments, thanks.
Sorry for the late update.

llvm/lib/Target/RISCV/RISCVISelLowering.cpp
11201

To preserve the APInt type, I use SplatVal = N->getConstantOperandAPInt(1).

I want to suggest a different approach here. I think this should be a combine on MGATHER (and variants), not a combine on the RISCV specific nodes. It might make sense to have this be a target specific combine, but I don't think it needs to be a combine on custom nodes.

Looking at one of your tests, I see this state immediately before lowering:

    t15: v8i64 = zero_extend t6
    t17: v8i64 = BUILD_VECTOR Constant:i64<1>, Constant:i64<1>, Constant:i64<1>, Constant:i64<1>, Constant:i64<1>, Constant:i64<1>, Constant:i64<1>, Constant:i64<1>
  t18: v8i64 = shl t15, t17
t27: v8i16,ch = masked_gather<(load unknown-size, align 2), signed unscaled offset> t0, t12, t9, t2, t18, TargetConstant:i64<1>

The MGATHER node supports scaled offsets, and allows the index type to differ from the native type (i.e. there's implicit extend on the operand). The later seems like it's the right tool here. See refineIndexType in DAGCombine for an example of some code which is related. Basically, we're just inverting the zero_extend placement, and then running that transform. We might even be able to do this in fully generic code.

llvm/test/CodeGen/RISCV/rvv/fixed-vectors-masked-gather.ll
1710

This is the test I'm referencing in my overall comment.

reames added inline comments.Sep 11 2023, 12:57 PM
llvm/lib/Target/RISCV/RISCVISelLowering.cpp
11201

vmv_v_x_vl also supports a passthru operand which may not be undef. You need to check for that to be sure you actually have a splat here.

> I want to suggest a different approach here. I think this should be a combine on MGATHER (and variants), not a combine on the RISCV specific nodes. It might make sense to have this be a target specific combine, but I don't think it needs to be a combine on custom nodes.
>
> Looking at one of your tests, I see this state immediately before lowering:
>
>     t15: v8i64 = zero_extend t6
>     t17: v8i64 = BUILD_VECTOR Constant:i64<1>, Constant:i64<1>, Constant:i64<1>, Constant:i64<1>, Constant:i64<1>, Constant:i64<1>, Constant:i64<1>, Constant:i64<1>
>   t18: v8i64 = shl t15, t17
> t27: v8i16,ch = masked_gather<(load unknown-size, align 2), signed unscaled offset> t0, t12, t9, t2, t18, TargetConstant:i64<1>
>
> The MGATHER node supports scaled offsets, and allows the index type to differ from the native type (i.e. there's implicit extend on the operand). The later seems like it's the right tool here. See refineIndexType in DAGCombine for an example of some code which is related. Basically, we're just inverting the zero_extend placement, and then running that transform. We might even be able to do this in fully generic code.

Thanks for the advice; I'm trying to understand it:

0. Base DAG

    t15: v8i64 = zero_extend t6
    t17: v8i64 = BUILD_VECTOR Constant:i64<1>, Constant:i64<1>, Constant:i64<1>, Constant:i64<1>, Constant:i64<1>, Constant:i64<1>, Constant:i64<1>, Constant:i64<1>
  t18: v8i64 = shl t15, t17
t27: v8i16,ch = masked_gather<(load unknown-size, align 2), signed unscaled offset> t0, t12, t9, t2, t18, TargetConstant:i64<1>
  1. Inverting the zero_extend placement:
     t29: v8i8 = BUILD_VECTOR Constant:i8<1>, Constant:i8<1>, Constant:i8<1>, Constant:i8<1>, Constant:i8<1>, Constant:i8<1>, Constant:i8<1>, Constant:i8<1>
    t30: v8i8 = shl t6, t29
  t31: v8i64 = zero_extend t30
t27: v8i16,ch = masked_gather<(load unknown-size, align 2), signed unscaled offset> t0, t12, t9, t2, t31, TargetConstant:i64<1>
  2. RVV indexed load/store instructions zero-extend their index operand to XLEN, so delete the zero_extend:
     t29: v8i8 = BUILD_VECTOR Constant:i8<1>, Constant:i8<1>, Constant:i8<1>, Constant:i8<1>, Constant:i8<1>, Constant:i8<1>, Constant:i8<1>, Constant:i8<1>
    t30: v8i8 = shl t6, t29
t27: v8i16,ch = masked_gather<(load unknown-size, align 2), signed unscaled offset> t0, t12, t9, t2, t30, TargetConstant:i64<1>

But the conversion 0 -> 1 does not appear to be equivalent: https://alive2.llvm.org/ce/z/-acjAs
Please correct me if I'm misunderstanding something, thanks!

If we don't invert the zero_extend placement, it seems we can still combine on MGATHER. I'll update the patch later. Thanks.

Please see https://github.com/llvm/llvm-project/pull/66405. This is an alternate implementation of this patch.

liaolucy abandoned this revision.Wed, Nov 22, 1:25 AM