This patch adds intrinsics for SVE gather loads whose offsets are 32 bits wide and are:
- unscaled
  - @llvm.aarch64.sve.ld1.gather.sxtw
  - @llvm.aarch64.sve.ld1.gather.uxtw
- scaled (offsets become indices)
  - @llvm.aarch64.sve.ld1.gather.sxtw.index
  - @llvm.aarch64.sve.ld1.gather.uxtw.index
The offsets are either zero-extended (uxtw) or sign-extended (sxtw) to 64 bits.
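As a rough illustration of the two extension modes, here is a minimal C++ sketch of how one 32-bit offset lane could be widened to 64 bits. The function names `extend_uxtw` and `extend_sxtw` are hypothetical and exist only for this example; they are not part of the patch or of LLVM.

```cpp
#include <cstdint>

// Hypothetical helper: zero-extend a 32-bit offset lane to 64 bits,
// modelling the uxtw modifier.
constexpr std::int64_t extend_uxtw(std::uint32_t lane) {
    return static_cast<std::int64_t>(lane);
}

// Hypothetical helper: sign-extend a 32-bit offset lane to 64 bits,
// modelling the sxtw modifier. Lanes with the top bit set are treated
// as negative two's-complement values.
constexpr std::int64_t extend_sxtw(std::uint32_t lane) {
    return lane >= 0x80000000u
               ? static_cast<std::int64_t>(lane) - (std::int64_t{1} << 32)
               : static_cast<std::int64_t>(lane);
}
```

Under this model the same bit pattern, for example 0xFFFFFFFF, becomes a large positive offset with uxtw and the offset -1 with sxtw.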
These intrinsics map one-to-one to the corresponding SVE instructions (examples for halfwords):
- unscaled
  - ld1h { z0.s }, p0/z, [x0, z0.s, sxtw]
  - ld1h { z0.s }, p0/z, [x0, z0.s, uxtw]
- scaled
  - ld1h { z0.s }, p0/z, [x0, z0.s, sxtw #1]
  - ld1h { z0.s }, p0/z, [x0, z0.s, uxtw #1]
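The difference between the unscaled and scaled forms can be sketched with a scalar reference model in C++. This is only an approximation of the halfword examples above, not code from the patch: `Extend`, `widen` and `gather_ld1h` are hypothetical names, the vector length is simply the size of the offset vector, and the caller is assumed to pass offsets that stay inside a valid buffer.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <vector>

// Hypothetical names throughout; this is a scalar model, not LLVM code.
enum class Extend { Sxtw, Uxtw };

// Widen one 32-bit offset lane to 64 bits using the chosen extension.
inline std::int64_t widen(std::uint32_t lane, Extend ext) {
    const std::int64_t value = static_cast<std::int64_t>(lane);
    if (ext == Extend::Sxtw && lane >= 0x80000000u) {
        return value - (std::int64_t{1} << 32);  // top bit set: negative
    }
    return value;
}

// Model of a predicated halfword gather into 32-bit lanes.
// Active lanes load 16 bits from base + offset (unscaled) or
// base + offset * 2 (scaled, the "#1" shift) and zero-extend the value.
// Inactive lanes are zeroed, as with the "/z" predicate qualifier.
inline std::vector<std::uint32_t> gather_ld1h(
    const unsigned char* base, const std::vector<bool>& pred,
    const std::vector<std::uint32_t>& offsets, Extend ext, bool scaled) {
    assert(pred.size() == offsets.size());
    std::vector<std::uint32_t> result(offsets.size(), 0);
    for (std::size_t i = 0; i < offsets.size(); ++i) {
        if (!pred[i]) continue;  // inactive lane stays zero
        std::int64_t byte_offset = widen(offsets[i], ext);
        if (scaled) byte_offset *= 2;  // index of a halfword -> byte offset
        std::uint16_t half = 0;
        std::memcpy(&half, base + byte_offset, sizeof(half));
        result[i] = half;
    }
    return result;
}
```

In this model, an unscaled offset of 2 and a scaled index of 1 address the same halfword, which is why the scaled variants are described as taking indices rather than byte offsets.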
The sxtw/uxtw/sxtw_index/uxtw_index intrinsics all share the same signature.
You could define a single class for that signature and derive each intrinsic from it, similar to what has been done for the other intrinsics.