This is an archive of the discontinued LLVM Phabricator instance.

[AArch64][SME] Add support for Copy/Spill/Fill of strided ZPR2/ZPR4 registers.
Closed, Public

Authored by sdesmalen on Aug 30 2023, 5:52 AM.

Details

Summary

This patch contains a few changes:

  • It changes the alignment of the strided/contiguous ZPR2/ZPR4 registers to 128 bits. This matters when these registers are spilled to the stack: the address does not need to be 256- or 512-bit aligned, because the single store/reload pseudo instruction is split into multiple STR_ZXI/LDR_ZXI (single-vector store/load) instructions, which only require 128-bit alignment. Additionally, an alignment larger than the stack alignment is not supported for scalable vectors.
  • It adds support for these register classes in storeRegToStackSlot, loadRegFromStackSlot and copyPhysReg.
  • It adds tests only for the strided forms. There is no need to also test the contiguous forms, because registers such as z2_z3 or z4_z5_z6_z7 are also part of the regular ZPR2 and ZPR4 register classes, respectively, which are already covered and tested.
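
The splitting described in the first bullet can be illustrated with a small standalone model. This is only a sketch, not the code from the patch: the `ZTuple` struct and the `expandSpill` helper are hypothetical names, and the assumption that the individual stores use consecutive `mul vl` slots is an illustration of the idea rather than a statement about the exact offsets the backend emits.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical model of a Z-register tuple (not an LLVM type).
struct ZTuple {
  unsigned First;   // number of the first Z register, e.g. 0 for z0_z8
  unsigned NumVecs; // 2 for ZPR2, 4 for ZPR4
  unsigned Stride;  // 1 for contiguous; 8 (ZPR2) or 4 (ZPR4) for strided
};

// Expand one tuple spill into NumVecs single-vector stores.
// Assumption: vector I of the tuple goes to slot BaseSlot + I, where a slot
// is one vector length ("mul vl"). Each store touches a single vector, so
// no store needs more than the alignment of a single vector.
std::vector<std::string> expandSpill(const ZTuple &T, int BaseSlot) {
  assert((T.NumVecs == 2 || T.NumVecs == 4) && "only ZPR2/ZPR4 are modelled");
  assert(T.First + (T.NumVecs - 1) * T.Stride <= 31 && "no such Z register");

  std::vector<std::string> Stores;
  for (unsigned I = 0; I < T.NumVecs; ++I) {
    unsigned Reg = T.First + I * T.Stride;
    Stores.push_back("str z" + std::to_string(Reg) + ", [sp, #" +
                     std::to_string(BaseSlot + static_cast<int>(I)) +
                     ", mul vl]");
  }
  return Stores;
}
```

Under these assumptions, `expandSpill(ZTuple{0, 2, 8}, 0)` models a spill of the strided pair z0_z8 as two stores, one for z0 and one for z8, while `ZTuple{2, 2, 1}` models the contiguous pair z2_z3. A reload would mirror this with single-vector loads.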

Diff Detail