This is an archive of the discontinued LLVM Phabricator instance.

[SME2][AArch64] Add multi-single multiply-add long long intrinsics
Closed · Public

Authored by kmclaughlin on Feb 3 2023, 8:56 AM.

Details

Summary

Adds intrinsics for the following SME2 instructions:

  • smlall (1, 2 & 4 vectors)
  • umlall (1, 2 & 4 vectors)
  • smlsll (1, 2 & 4 vectors)
  • umlsll (1, 2 & 4 vectors)
  • sumlall (2 & 4 vectors)
  • usmlall (1, 2 & 4 vectors)

NOTE: These intrinsics are still in development and are subject to future changes.
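
For readers unfamiliar with the "long long" forms, the following is a minimal sketch of what a single-vector signed multiply-add long long computes for 8-bit inputs and 32-bit accumulators. It is an illustration based on a reading of the Arm architecture pseudocode, not code from this patch. The function name smlall_single and the list-based layout of ZA and the Z registers are hypothetical, and slice selection, predication and vector-length handling are deliberately left out.

```python
def smlall_single(za, zn, zm):
    """Model a widening multiply-add: signed 8-bit inputs, 32-bit accumulators.

    za: four rows (a group of four ZA vectors), each a list of 32-bit lanes.
    zn, zm: lists of signed 8-bit values, four per 32-bit lane.
    (Hypothetical layout chosen for illustration only.)
    """
    lanes = len(za[0])
    assert len(za) == 4 and len(zn) == len(zm) == 4 * lanes
    for i in range(4):          # one pass per ZA vector in the group
        for e in range(lanes):  # one 32-bit accumulator per lane
            product = zn[4 * e + i] * zm[4 * e + i]
            # Wrap to a signed 32-bit value, as a fixed-width accumulator would.
            total = (za[i][e] + product) & 0xFFFFFFFF
            za[i][e] = total - (1 << 32) if total & 0x80000000 else total
    return za


za = [[0, 0] for _ in range(4)]
zn = [1, 2, 3, 4, -5, -6, -7, -8]
zm = [10, 10, 10, 10, 2, 2, 2, 2]
result = smlall_single(za, zn, zm)
```

Under these assumptions, byte i of each 32-bit lane feeds ZA vector i of the group, which is why one 8-bit source vector updates four accumulator vectors. The subtracting (smlsll/umlsll) and mixed-sign (sumlall/usmlall) variants would differ only in the sign of the update and in how each input is interpreted.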

Event Timeline

kmclaughlin created this revision. Feb 3 2023, 8:56 AM
Herald added a project: Restricted Project. Feb 3 2023, 8:56 AM
kmclaughlin requested review of this revision. Feb 3 2023, 8:56 AM
david-arm added inline comments. Feb 8 2023, 8:47 AM
llvm/lib/Target/AArch64/SMEInstrFormats.td:2757

This is just a suggestion, but in theory you could reduce the amount of duplication by creating a multiclass sme2_mla_ll_array_vg24_single like this:

multiclass sme2_mla_ll_array_vg24_single<string mnemonic, bits<5> op,
                                        MatrixOperand matrix_ty,
                                        RegisterOperand multi_vector_ty,
                                        ZPRRegOp zpr_ty, ValueType vt, SDPatternOperator intrinsic> {
  def NAME: sme2_mla_ll_array_vg24_single<op, matrix_ty, multi_vector_ty,
                                        zpr_ty, mnemonic>, SMEPseudo2Instr<NAME, 1>;

  def NAME # _PSEUDO : sme2_za_array_2op_multi_single_pseudo<NAME, uimm1s4range, multi_vector_ty, zpr_ty, SMEMatrixArray>;

  def : InstAlias<mnemonic # "\t$ZAd[$Rv, $imm], $Zn, $Zm",
                 (!cast<Instruction>(NAME) matrix_ty:$ZAd,  MatrixIndexGPR32Op8_11:$Rv, uimm1s4range:$imm, multi_vector_ty:$Zn, zpr_ty:$Zm), 0>;
}

Then for each of vg2 and vg4 you just need:

multiclass sme2_mla_ll_array_vg2_single<string mnemonic, bits<5> op,
                                        MatrixOperand matrix_ty,
                                        RegisterOperand multi_vector_ty,
                                        ZPRRegOp zpr_ty, ValueType vt, SDPatternOperator intrinsic> {
  defm NAME : sme2_mla_ll_array_vg24_single<mnemonic, op, matrix_ty, multi_vector_ty, zpr_ty, vt, intrinsic>;

  def : SME2_ZA_TwoOp_VG2_Multi_Single_Pat<NAME, intrinsic, uimm1s4range, zpr_ty, vt, tileslicerange1s4>;
}

Please feel free to ignore this suggestion if you think it doesn't improve things. I wouldn't hold up the patch for it!

llvm/test/CodeGen/AArch64/sme2-intrinsics-mlall.ll:3

nit: I don't think we need the -mattr=+sve flag here, since sme2 implies it.

Matt added a subscriber: Matt. Feb 8 2023, 2:36 PM
kmclaughlin marked 2 inline comments as done.
  • Added a sme2_mla_ll_array_vg24_single multiclass to reduce duplication in sme2_mla_ll_array_vg2_single & sme2_mla_ll_array_vg4_single
  • Removed -mattr=+sve from sme2-intrinsics-mlall.ll
david-arm accepted this revision. Feb 15 2023, 8:59 AM

LGTM! Very good!

This revision is now accepted and ready to land. Feb 15 2023, 8:59 AM