This is an archive of the discontinued LLVM Phabricator instance.

[AArch64][SME2/SVE2p1] Add predicate-as-counter intrinsics for ld1/ldnt1/st1/stnt1
ClosedPublic

Authored by sdesmalen on May 19 2023, 3:24 AM.

Details

Summary

These intrinsics are used to implement multi-vector load/store intrinsics that load
or store a tuple of 2 or 4 values, based on a predicate-as-counter operand, e.g.

__attribute__((arm_streaming))
svuint8x2_t svld1[_u8]_x2(svcount_t png, const uint8_t *rn);

__attribute__((arm_streaming))
void svst1[_u8_x2](svcount_t png, uint8_t *rn, svuint8x2_t zt);

As described in https://github.com/ARM-software/acle/pull/217

Event Timeline

sdesmalen created this revision.May 19 2023, 3:24 AM
sdesmalen requested review of this revision.May 19 2023, 3:24 AM
Herald added a project: Restricted Project.May 19 2023, 3:24 AM
llvm/lib/Target/AArch64/AArch64SVEInstrInfo.td
3844

Should this be ZPR2Mul2? Likewise for store_pn_x4, I think it should be ZPR4Mul4, because the multi-vector operands are all mul_r.

sdesmalen updated this revision to Diff 523792.May 19 2023, 8:29 AM

Changed register class to ZPR2Mul2, ZPR4Mul4.
Updated the store tests to take an unused argument, which forces all tests to copy the input registers to a tuple that starts at a multiple of 2 or 4.
Removed the explicit tests for these cases, as those are now redundant.
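
The alignment constraint behind ZPR2Mul2 and ZPR4Mul4 can be sketched in plain C. This is only an illustrative model, not code from the patch or from LLVM: the helper names tuple_start_is_valid and next_valid_tuple_start are hypothetical, and the model assumes 32 Z registers and tuples of consecutive registers whose first register number is a multiple of the tuple size.

```c
#include <stdbool.h>

/* Illustrative model only; these helpers are hypothetical and not LLVM code. */
enum { NUM_Z_REGS = 32 };

/* Returns true if a tuple of `count` consecutive Z registers may start at
 * z<first>. Models the "Mul2"/"Mul4" rule: the first register number must be
 * a multiple of the tuple size. Only tuple sizes 2 and 4 are accepted. */
static bool tuple_start_is_valid(unsigned first, unsigned count)
{
    if (count != 2 && count != 4)
        return false;
    return first < NUM_Z_REGS && first % count == 0;
}

/* Returns the lowest valid starting register number >= first, or -1 if no
 * such register exists. Models the copy a store has to make when its inputs
 * arrive in registers that do not form an aligned tuple. */
static int next_valid_tuple_start(unsigned first, unsigned count)
{
    if (count != 2 && count != 4)
        return -1;
    unsigned aligned = (first + count - 1) / count * count; /* round up */
    return aligned < NUM_Z_REGS ? (int)aligned : -1;
}
```

Under this model, a 2-vector store whose data arrives in z1 and z2 (as happens when an unused leading vector argument occupies z0) cannot use those registers directly, since tuple_start_is_valid(1, 2) is false. The values have to be copied to an aligned pair first, which is the situation the updated store tests are meant to exercise.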

sdesmalen added inline comments.May 19 2023, 8:30 AM
llvm/lib/Target/AArch64/AArch64SVEInstrInfo.td
3844

It should, although oddly it was doing the right thing, probably because of the register class defined for the instruction.

CarolineConcatto accepted this revision.May 22 2023, 5:14 AM

Probably because the test does not use any SVE registers, so it can use the ones available.

This revision is now accepted and ready to land.May 22 2023, 5:14 AM
This revision was landed with ongoing or failed builds.May 22 2023, 6:51 AM
This revision was automatically updated to reflect the committed changes.