Adds intrinsics for the following SME2 instructions:
- smlall (1, 2 & 4 vectors)
- umlall (1, 2 & 4 vectors)
- smlsll (1, 2 & 4 vectors)
- umlsll (1, 2 & 4 vectors)
- sumlall (2 & 4 vectors)
- usmlall (1, 2 & 4 vectors)
NOTE: These intrinsics are still in development and are subject to future changes.
This is just a suggestion, but you could reduce the duplication by creating a multiclass sme2_mla_ll_array_vg24_single like this:
Then for each of vg2 and vg4 you just need:
Please feel free to ignore this suggestion if you think it doesn't improve things. I wouldn't hold up the patch for it!