This is an archive of the discontinued LLVM Phabricator instance.

[AArch64]SME2 Multi-single vector SVE Destructive 2 and 4 Registers
ClosedPublic

Authored by CarolineConcatto on Oct 10 2022, 1:51 AM.

Details

Summary

This patch adds the assembly/disassembly for the following instructions:

ADD (to vector): Add replicated single vector to multi-vector with multi-vector result.
SQDMULH (multiple and single vector): Multi-vector signed saturating doubling multiply high by vector.

for groups of 2 and 4 SVE Z registers.
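
As a rough illustration of what the two instructions compute, here is a minimal C++ sketch. It is not taken from the patch: the fixed 8-lane, 16-bit vector model and the names `Vec`, `addToVectorGroup` and `sqdmulhLane` are assumptions made for the example, and the real vector length is implementation defined.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical model: one vector register is 8 lanes of 16 bits.
constexpr std::size_t kLanes = 8;
using Vec = std::array<std::int16_t, kLanes>;

// ADD (to vector): add the single vector zm to every vector in the group and
// write the result back into the group (destructive). Lanes wrap on overflow.
template <std::size_t N>
void addToVectorGroup(std::array<Vec, N> &zdn, const Vec &zm) {
  static_assert(N == 2 || N == 4, "groups of 2 or 4 registers only");
  for (Vec &v : zdn) {
    for (std::size_t lane = 0; lane < kLanes; ++lane) {
      // Do the addition on unsigned values so that wrap-around is well defined.
      std::uint16_t sum = static_cast<std::uint16_t>(
          static_cast<std::uint16_t>(v[lane]) +
          static_cast<std::uint16_t>(zm[lane]));
      v[lane] = static_cast<std::int16_t>(sum);
    }
  }
}

// SQDMULH on a single lane: double the product, keep the high half, saturate.
std::int16_t sqdmulhLane(std::int16_t a, std::int16_t b) {
  std::int64_t doubled =
      2 * static_cast<std::int64_t>(a) * static_cast<std::int64_t>(b);
  // Arithmetic right shift is assumed here (guaranteed since C++20).
  std::int64_t high = doubled >> 16;
  if (high > INT16_MAX) return INT16_MAX;
  if (high < INT16_MIN) return INT16_MIN;
  return static_cast<std::int16_t>(high);
}
```

In this model the only input pair that saturates is -32768 times -32768, where the doubled product no longer fits in the high half.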

The reference can be found here:

https://developer.arm.com/documentation/ddi0602/2022-09

It also adds multiple-register tuple operands for additional element sizes:

ZZ_b_mul_r, ZZ_h_mul_r,
ZZZZ_b_mul_r, ZZZZ_h_mul_r

covering 8-bit and 16-bit elements with groups of 2 and 4 registers.

Depends on: D135468

Event Timeline

CarolineConcatto requested review of this revision. Oct 10 2022, 1:51 AM
Herald added a project: Restricted Project. Oct 10 2022, 1:51 AM
  • Add missing diagnostics for the ZA multiple operands
Matt added a subscriber: Matt. Oct 10 2022, 10:52 PM
dmgreen added inline comments.
llvm/test/MC/AArch64/SME2/add.s
47

Your find-and-replace looks like it went a bit too far :)

sdesmalen added inline comments. Oct 19 2022, 7:03 AM
llvm/lib/Target/AArch64/AArch64RegisterInfo.td
1100

Using ZPR2 here (and ZPR4 for ZZZZ_b_mul_r) isn't correct.

ZPR2 allows:

{z0.b, z1.b}
{z1.b, z2.b}
{z2.b, z3.b}
{z3.b, z4.b}
...

But {z1.b, z2.b} and {z3.b, z4.b} are not valid for ZZ_b_mul_r, because the number of the first register must be a multiple of 2.

To fix this, you can create a new register class that only takes the "even" pairs (for ZPR2) or every fourth quad (for ZPR4) like this:

// SME2 multiple-of-2 or 4 multi-vector operands
def ZPR2Mul2 : RegisterClass<"AArch64", [untyped], 128, (add (decimate ZSeqPairs, 2))> {
  let Size = 256;
}

def ZPR4Mul4 : RegisterClass<"AArch64", [untyped], 128, (add (decimate ZSeqQuads, 4))> {
  let Size = 512;
}
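
To make the effect of `decimate` concrete, the following C++ sketch enumerates which tuples survive. It is only a model written for this note: it assumes that ZSeqPairs and ZSeqQuads list the consecutive tuples starting at z0, z1, ..., z31 (wrapping around after z31) and that `decimate` keeps every N-th element starting with the first. The function names and the tuple name strings are hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Model of a sequential tuple list: all consecutive n-register tuples starting
// at z0, z1, ..., z31, wrapping modulo 32 (assumed shape of ZSeqPairs/ZSeqQuads).
std::vector<std::string> seqTuples(unsigned n) {
  std::vector<std::string> out;
  for (unsigned first = 0; first < 32; ++first) {
    std::string name;
    for (unsigned i = 0; i < n; ++i) {
      if (i != 0) name += "_";
      name += "Z" + std::to_string((first + i) % 32);
    }
    out.push_back(name);
  }
  return out;
}

// Model of decimate: keep every stride-th element, starting with the first.
std::vector<std::string> decimate(const std::vector<std::string> &in,
                                  unsigned stride) {
  assert(stride > 0 && "stride must be positive");
  std::vector<std::string> out;
  for (std::size_t i = 0; i < in.size(); i += stride) out.push_back(in[i]);
  return out;
}
```

Under these assumptions, `decimate(seqTuples(2), 2)` keeps the 16 pairs that start at an even register, and `decimate(seqTuples(4), 4)` keeps the 8 quads that start at a multiple of 4.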
david-arm added inline comments. Oct 19 2022, 7:10 AM
llvm/lib/Target/AArch64/AArch64RegisterInfo.td
1100

This makes sense, although I think the diagnostics tests do show that we correctly reject register lists whose first register is not a multiple of 2 or 4? Not sure why this works - perhaps the ParserMatchClass takes care of it?

sdesmalen added inline comments. Oct 19 2022, 7:18 AM
llvm/lib/Target/AArch64/AArch64RegisterInfo.td
1100

That's indeed because of the ParserMatchClass (through the PredicateMethod), which is only relevant to the assembler.
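
The kind of check such a predicate performs can be sketched in C++ as follows. This is a hypothetical stand-in, not the actual LLVM method: the name `isValidMultipleStart` and its parameters are assumptions for illustration. The point is that a check like this only runs when assembly text is parsed, so the register class itself still has to exclude the invalid tuples.

```cpp
#include <cassert>

// Hypothetical assembler-side predicate: accept a register list only if the
// number of its first Z register is a multiple of the list length (2 or 4).
bool isValidMultipleStart(unsigned firstReg, unsigned numRegs) {
  assert((numRegs == 2 || numRegs == 4) && "only 2- and 4-register lists");
  if (firstReg > 31) return false;  // z0..z31 are the only Z registers
  return firstReg % numRegs == 0;
}
```

With this sketch, a list starting at z1 or z3 is rejected for the 2-register form, matching the examples above.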

CarolineConcatto marked an inline comment as done.

-Rebase on top of D135468

sdesmalen accepted this revision. Oct 20 2022, 7:00 AM

LGTM

llvm/test/MC/AArch64/SME2/add.s
1349

nit: redundant newline here and in other places in this file.

This revision is now accepted and ready to land. Oct 20 2022, 7:00 AM
This revision was landed with ongoing or failed builds. Oct 20 2022, 10:55 AM
This revision was automatically updated to reflect the committed changes.
CarolineConcatto marked an inline comment as done.