Page MenuHomePhabricator

[AArch64][SVE] Add SVE2 mla indexed intrinsics

Authored by dancgr on Jan 28 2020, 1:06 PM.



Add SVE2 mla indexed intrinsics:

  • smlalb, smalalt, umlalb, umlalt, smlslb, smlslt, umlslb, umlslt.

Diff Detail

Event Timeline

dancgr created this revision.Jan 28 2020, 1:06 PM
Herald added a project: Restricted Project. · View Herald TranscriptJan 28 2020, 1:06 PM
dancgr removed a reviewer: dancgr.Jan 28 2020, 1:08 PM
dancgr added a reviewer: kmclaughlin.
dancgr retitled this revision from [AArch64][SVE] Add SVE2 mla indexed intrinsics. to [AArch64][SVE] Add SVE2 mla indexed intrinsics.Jan 28 2020, 1:17 PM

I don't see any testcases for the byte variant? (smlalb z0.h, z1.b, z2.b[0]).

efriedma accepted this revision.Jan 28 2020, 1:28 PM

Err, nevermind, that's not legal.


This revision is now accepted and ready to land.Jan 28 2020, 1:28 PM
This revision was automatically updated to reflect the committed changes.

@dancgr I see you already committed the patch, but could you please still address my comments?


For consistency with other intrinsics, these need to use an i32 value for the immediate. The reason is that the front-end will generate the calls to the intrinsics automatically and will use an i32 for the immediates. Having one exception to the rule means this is something we'll need to fix in Clang or definition of the intrinsic later.


In order to be consistent with other intrinsics that have indexed form (such as int_aarch64_sve_fmlalb_lane), these indexed forms are better named int_aarch64_sve_smlalb_lane, where int_aarch64_sve_smlalb are used for the vectors unpredicated form.

It may come across as if I'm being a bit pedantic here with the naming, but not deviating unnecessarily from our downstream implementation (see D71712 for reference) really helps us when we go and upstream the Clang side of the ACLE (which we're preparing to upstream as we speak). We generate code for Clang to map C/C++ level builtins -> llvm ir intrinsics, and any changes to the intrinsic names like this, we will need to fix up in our mapping and tests. This probably isn't too complicated, but it would be another thing we'd need to fix, so better to avoid this if we can.

I will make the changes suggested by Sander in a following patch, joined with the saturating multiply-add long intirnsics.


Sure, no problem. The reason I chose i64 was because the original InstrFormat class definition used VectorIndexH and VectorIndexS, which are i64 immediates. I will make a switch to i32 formats.


Sure, will update that.