Adds intrinsics for the following:
- smopa / smops
- umopa / umops
- bmopa / bmops
Tests for existing SME mopa/mops intrinsics have also been updated
to use the maximum allowed ZA tile number.
NOTE: These intrinsics are still in development and are subject
to future changes.
Not saying you should do this as part of this patch, but I wonder whether at some point we should change the existing SME smopa/umopa_wide intrinsics to use int_aarch64_sme_smopa_za32, etc. as well; the smopa (4-way) 8-bit variant is one example.