These intrinsics are used to implement the multi-vector load/store intrinsics that load
or store a tuple of 2 or 4 vectors, based on a predicate-as-counter operand, e.g.
__attribute__((arm_streaming)) svuint8x2_t svld1[_u8]_x2(svcount_t png, const uint8_t *rn);
__attribute__((arm_streaming)) void svst1[_u8_x2](svcount_t png, uint8_t *rn, svuint8x2_t zt);
As described in https://github.com/ARM-software/acle/pull/217
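
To make the intended behaviour of the x2 forms concrete, here is a minimal scalar sketch in portable C. It is not the ACLE implementation and does not use `arm_sve.h`. The names `MODEL_VL_BYTES`, `model_u8x2`, `model_count`, `model_ld1_u8_x2` and `model_st1_u8_x2` are hypothetical. It assumes a fixed 16-byte vector length and models the predicate-as-counter only as "the first N elements, counted across the whole tuple, are active", ignoring the inverted encoding. Inactive elements are zeroed on load and left untouched in memory on store.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical fixed vector length in bytes (assumption for this model only;
   real vector lengths are implementation-defined). */
#define MODEL_VL_BYTES 16

/* Scalar stand-in for svuint8x2_t: a tuple of two byte vectors. */
typedef struct {
    uint8_t v[2][MODEL_VL_BYTES];
} model_u8x2;

/* Simplified stand-in for svcount_t: the first `active` elements, counted
   across both vectors of the tuple, are active. */
typedef struct {
    size_t active;
} model_count;

/* Model of the x2 load: active elements are read from consecutive memory,
   inactive elements are set to zero. */
static model_u8x2 model_ld1_u8_x2(model_count png, const uint8_t *rn)
{
    model_u8x2 zt;
    memset(&zt, 0, sizeof zt);
    for (size_t i = 0; i < 2 * MODEL_VL_BYTES && i < png.active; ++i)
        zt.v[i / MODEL_VL_BYTES][i % MODEL_VL_BYTES] = rn[i];
    return zt;
}

/* Model of the x2 store: only active elements are written to memory. */
static void model_st1_u8_x2(model_count png, uint8_t *rn, model_u8x2 zt)
{
    for (size_t i = 0; i < 2 * MODEL_VL_BYTES && i < png.active; ++i)
        rn[i] = zt.v[i / MODEL_VL_BYTES][i % MODEL_VL_BYTES];
}
```

With `active` set to 20 in this model, the whole first vector and the first four elements of the second vector are transferred; the remaining twelve elements are zero after a load and are not written by a store.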
Should this be ZPR2Mul2? Likewise for store_pn_x4, I think it should be ZPR4Mul4, because the multi-vector operands are all mul_r?