Adds intrinsics for the following SME2 instructions:
- smlall (1, 2 & 4 vectors)
- umlall (1, 2 & 4 vectors)
- smlsll (1, 2 & 4 vectors)
- umlsll (1, 2 & 4 vectors)
- sumlall (2 & 4 vectors)
- usmlall (1, 2 & 4 vectors)
NOTE: These intrinsics are still in development and are subject to future changes.
This is just a suggestion, but in theory you could reduce the amount of duplication by creating a multiclass sme2_mla_ll_array_vg24_single like this:

    multiclass sme2_mla_ll_array_vg24_single<string mnemonic, bits<5> op,
                                             MatrixOperand matrix_ty,
                                             RegisterOperand multi_vector_ty,
                                             ZPRRegOp zpr_ty, ValueType vt,
                                             SDPatternOperator intrinsic> {
      def NAME : sme2_mla_ll_array_vg24_single<op, matrix_ty, multi_vector_ty,
                                               zpr_ty, mnemonic>,
                 SMEPseudo2Instr<NAME, 1>;

      def NAME # _PSEUDO : sme2_za_array_2op_multi_single_pseudo<NAME, uimm1s4range,
                                                                 multi_vector_ty, zpr_ty,
                                                                 SMEMatrixArray>;

      def : InstAlias<mnemonic # "\t$ZAd[$Rv, $imm], $Zn, $Zm",
                      (!cast<Instruction>(NAME) matrix_ty:$ZAd,
                                                MatrixIndexGPR32Op8_11:$Rv,
                                                uimm1s4range:$imm,
                                                multi_vector_ty:$Zn, zpr_ty:$Zm), 0>;
    }

Then for each of vg2 and vg4 you just need:

    multiclass sme2_mla_ll_array_vg2_single<string mnemonic, bits<5> op,
                                            MatrixOperand matrix_ty,
                                            RegisterOperand multi_vector_ty,
                                            ZPRRegOp zpr_ty, ValueType vt,
                                            SDPatternOperator intrinsic> {
      defm NAME : sme2_mla_ll_array_vg24_single<mnemonic, op, matrix_ty,
                                                multi_vector_ty, zpr_ty, vt,
                                                intrinsic>;

      def : SME2_ZA_TwoOp_VG2_Multi_Single_Pat<NAME, intrinsic, uimm1s4range,
                                               zpr_ty, vt, tileslicerange1s4>;
    }

Please feel free to ignore this suggestion if you think it doesn't improve things. I wouldn't hold up the patch for it!