This is an archive of the discontinued LLVM Phabricator instance.

[Clang][AArch64][SME] Add outer product intrinsics
ClosedPublic

Authored by bryanpkc on Sep 26 2022, 2:57 PM.

Details

Summary

This patch adds support for the following SME ACLE intrinsics (as defined
in https://arm-software.github.io/acle/main/acle.html):

  • svmopa_za32[_bf16]_m // also for s8, u8, f16, f32
  • svmops_za32[_bf16]_m // also for s8, u8, f16, f32
  • svsumopa_za32[_s8]_m
  • svsumops_za32[_s8]_m
  • svusmopa_za32[_u8]_m
  • svusmops_za32[_u8]_m
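
For readers unfamiliar with these intrinsics, the following is a rough scalar sketch of what the non-widening f32 forms compute on one ZA tile. It is not the ACLE API and not code from this patch: the function name `mop_f32_model`, the fixed size `N`, and the use of `bool` arrays in place of `svbool_t` predicates are all assumptions made for illustration, and the separate multiply and add ignores the fused rounding that the hardware instruction performs.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the number of 32-bit elements in a streaming
   vector; real code does not fix this at compile time. */
enum { N = 4 };

/* Scalar model (an assumption for illustration, not the ACLE API) of
   svmopa_za32[_f32]_m when negate is false, and of svmops_za32[_f32]_m
   when negate is true, acting on a single N x N tile. */
static void mop_f32_model(float za[N][N], const bool pn[N], const bool pm[N],
                          const float zn[N], const float zm[N], bool negate)
{
    for (size_t row = 0; row < N; ++row) {
        for (size_t col = 0; col < N; ++col) {
            /* _m (merging): a tile element is updated only when both the
               row predicate and the column predicate are active. */
            if (!pn[row] || !pm[col])
                continue;
            float prod = zn[row] * zm[col];
            za[row][col] += negate ? -prod : prod;
        }
    }
}
```

Calling the model once with `negate` false and once with `negate` true on the same inputs returns the tile to its starting values for exactly representable products, which mirrors the accumulate/subtract pairing of the mopa and mops forms.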

When the sme-f64f64 feature is enabled, the following intrinsics are supported:

  • svmopa_za64[_f64]_m
  • svmops_za64[_f64]_m

When the sme-i16i64 feature is enabled, the following intrinsics are supported:

  • svmopa_za64[_s16]_m // also for u16
  • svmops_za64[_s16]_m // also for u16
  • svsumopa_za64[_s16]_m
  • svsumops_za64[_s16]_m
  • svusmopa_za64[_u16]_m
  • svusmops_za64[_u16]_m
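
The widening integer forms differ from the floating-point ones in that each ZA element accumulates a sum of four products. The sketch below models the signed-by-unsigned case (svsumopa_za64[_s16]_m) in scalar C. It rests on several assumptions that do not come from this patch: the name `sumopa_za64_model`, the fixed size `D`, `bool` arrays standing in for predicates, and the reading that an inactive 16-bit source element contributes zero to the sum. Treat it as an illustration of the arithmetic, not as a statement of the architecture.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* D is a hypothetical stand-in for the number of 64-bit elements per
   streaming vector; GROUP is the number of 16-bit source elements that
   feed one 64-bit accumulator. */
enum { D = 2, GROUP = 4 };

/* Scalar model (illustrative assumption, not the ACLE API) of a widening
   signed-by-unsigned outer product accumulating into a D x D tile of
   64-bit elements. zn holds signed 16-bit values, zm unsigned ones. */
static void sumopa_za64_model(int64_t za[D][D],
                              const bool pn[D * GROUP],
                              const bool pm[D * GROUP],
                              const int16_t zn[D * GROUP],
                              const uint16_t zm[D * GROUP])
{
    for (size_t row = 0; row < D; ++row) {
        for (size_t col = 0; col < D; ++col) {
            int64_t sum = 0;
            for (size_t k = 0; k < GROUP; ++k) {
                size_t a = row * GROUP + k;
                size_t b = col * GROUP + k;
                /* Assumed: an inactive source element acts as zero. */
                if (pn[a] && pm[b])
                    sum += (int64_t)zn[a] * (int64_t)zm[b];
            }
            /* Small inputs only: this model does not reproduce the
               wrap-around behaviour of the real accumulator. */
            za[row][col] += sum;
        }
    }
}
```

Swapping which operand is signed gives the corresponding model for the usmopa form, and subtracting `sum` instead of adding it gives the mops variants.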

Co-authored-by: Sagar Kulkarni <sagar.kulkarni1@huawei.com>

Event Timeline

Herald added a project: Restricted Project. Sep 26 2022, 2:57 PM
sagarkulkarni19 requested review of this revision. Sep 26 2022, 2:57 PM

Thanks for the patch. This is going to be inconvenient, sorry, but: while implementing the specification in GCC, I noticed that the ZA functions weren't consistent about whether they had an _m suffix. svwrite (MOVA) had one, but the MOP intrinsics that you're implementing here didn't. Since SME2 does have some unpredicated instructions, it seems like it would be better to make the MOP intrinsics consistent with svwrite, with an _m suffix.

I've created https://github.com/ARM-software/acle/pull/218 for that change. Please let me know if it looks reasonable to you.

Thanks for letting me know. I can change the MOP and ADD intrinsics to add an _m suffix.
Yes, this looks reasonable to me.

bryanpkc commandeered this revision. Jan 21 2023, 2:22 PM
bryanpkc added a reviewer: sagarkulkarni19.
bryanpkc updated this revision to Diff 492693. Jan 27 2023, 3:39 AM
bryanpkc retitled this revision from "[Clang][AArch64] Add SME outer product intrinsics" to "[Clang][AArch64][SME] Add outer product intrinsics".
bryanpkc edited the summary of this revision.

Rebased and cleaned up the patch. Also added the _m suffix to the intrinsics, as required by the amendment in https://github.com/ARM-software/acle/pull/218.

bryanpkc updated this revision to Diff 528291. Jun 4 2023, 11:06 PM
Matt added a subscriber: Matt. Jun 5 2023, 12:24 PM
bryanpkc updated this revision to Diff 540841. Jul 16 2023, 5:58 PM

Removed the attribute macro from tests, as @sdesmalen suggested.

sdesmalen accepted this revision. Jul 17 2023, 6:44 AM
This revision is now accepted and ready to land. Jul 17 2023, 6:44 AM
This revision was landed with ongoing or failed builds. Jul 20 2023, 2:58 AM
This revision was automatically updated to reflect the committed changes.