This is an archive of the discontinued LLVM Phabricator instance.

[AArch64]SME2 Non-contiguous load and store
AbandonedPublic

Authored by CarolineConcatto on Oct 20 2022, 2:50 AM.

Details

Summary

This patch adds the assembly/disassembly for the following instruction:

Non-constiguous load with stride resgisters:

LD1B (scalar + immediate): Contiguous load of bytes to multiple strided vectors (immediate index).
     (scalar + scalar): Contiguous load of bytes to multiple strided vectors (scalar index).
LD1D (scalar + immediate): Contiguous load of doublewords to multiple strided vectors (immediate index).
     (scalar + scalar): Contiguous load of doublewords to multiple strided vectors (scalar index).
LD1H (scalar + immediate): Contiguous load of halfwords to multiple strided vectors (immediate index).
     (scalar + scalar): Contiguous load of halfwords to multiple strided vectors (scalar index).
LD1W (scalar + immediate): Contiguous load of words to multiple strided vectors (immediate index).
     (scalar + scalar): Contiguous load of words to multiple strided vectors (scalar index).

LDNT1B (scalar + immediate): Contiguous load non-temporal of bytes to multiple strided vectors (immediate index).
       (scalar + scalar): Contiguous load non-temporal of bytes to multiple strided vectors (scalar index).
LDNT1D (scalar + immediate): Contiguous load non-temporal of doublewords to multiple strided vectors (immediate index).
       (scalar + scalar): Contiguous load non-temporal of doublewords to multiple strided vectors (scalar index).
LDNT1H (scalar + immediate): Contiguous load non-temporal of halfwords to multiple strided vectors (immediate index).
       (scalar + scalar): Contiguous load non-temporal of halfwords to multiple strided vectors (scalar index).
LDNT1W (scalar + immediate): Contiguous load non-temporal of words to multiple strided vectors (immediate index).
       (scalar + scalar): Contiguous load non-temporal of words to multiple strided vectors (scalar index).

Non-constiguous store with stride resgisters:

ST1B (scalar + immediate): Contiguous store of bytes from multiple strided vectors (immediate index).
     (scalar + scalar): Contiguous store of bytes from multiple strided vectors (scalar index).
ST1D (scalar + immediate): Contiguous store of doublewords from multiple strided vectors (immediate index).
     (scalar + scalar): Contiguous store of doublewords from multiple strided vectors (scalar index).
ST1H (scalar + immediate): Contiguous store of halfwords from multiple strided vectors (immediate index).
     (scalar + scalar): Contiguous store of halfwords from multiple strided vectors (scalar index).
ST1W (scalar + immediate): Contiguous store of words from multiple strided vectors (immediate index).
     (scalar + scalar): Contiguous store of words from multiple strided vectors (scalar index).

STNT1B (scalar + immediate): Contiguous store non-temporal of bytes from multiple strided vectors (immediate index).
       (scalar + scalar): Contiguous store non-temporal of bytes from multiple strided vectors (scalar index).
STNT1D (scalar + immediate): Contiguous store non-temporal of doublewords from multiple strided vectors (immediate index).
       (scalar + scalar): Contiguous store non-temporal of doublewords from multiple strided vectors (scalar index).
STNT1H (scalar + immediate): Contiguous store non-temporal of halfwords from multiple strided vectors (immediate index).
       (scalar + scalar): Contiguous store non-temporal of halfwords from multiple strided vectors (scalar index).
STNT1W (scalar + immediate): Contiguous store non-temporal of words from multiple strided vectors (immediate index).
       (scalar + scalar): Contiguous store non-temporal of words from multiple strided vectors (scalar index).

The reference can be found here:

https://developer.arm.com/documentation/ddi0602/2022-09

This patch also adds a new SVE vector list to represent the stride loads/stores
ZPRVectorListStrided and the sets of 2 and 4 ZA registers:
ZZ_[b|h|w|d]_strided and ZZZZ_[b|h|w|d]_strided

Depends on: D136172

Diff Detail