[AArch64][SME] Add load and store instructions
This patch adds support for the following contiguous load and store
instructions:
- LD1B, LD1H, LD1W, LD1D, LD1Q
- ST1B, ST1H, ST1W, ST1D, ST1Q
A new register class and operand are added for the 32-bit vector select
registers W12-W15. The differences in the following re-generated tests
are caused by the introduction of this register class:
D88663 attempts to resolve the issue with the store pair test
differences in the AArch64 load/store optimizer.
The GlobalISel differences are caused by changes in the enum values of
register classes; the tests have been updated with the new values.
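For illustration only, the constraint expressed by the new W12-W15
register class can be sketched as below. This is not code from the
patch: the helper name encode_vector_select is hypothetical, and the
sketch assumes the vector select register is encoded as a two-bit
offset from W12.

```python
def encode_vector_select(reg: str) -> int:
    """Map a vector select register name (W12-W15) to a 2-bit field value.

    Hypothetical helper for illustration; assumes the field holds the
    register number minus 12, so W12 -> 0 and W15 -> 3.
    """
    name = reg.strip().lower()
    # Only 32-bit general-purpose register names ("wN") are candidates.
    if not name.startswith("w") or not name[1:].isdigit():
        raise ValueError(f"not a 32-bit general-purpose register: {reg!r}")
    index = int(name[1:])
    # The register class contains only W12, W13, W14 and W15.
    if not 12 <= index <= 15:
        raise ValueError(f"{reg!r} is outside the W12-W15 register class")
    return index - 12


if __name__ == "__main__":
    for r in ("w12", "w13", "w14", "w15"):
        print(r, format(encode_vector_select(r), "02b"))
```

Any register outside W12-W15 is rejected, which mirrors why a dedicated
register class and operand are needed rather than the general GPR32
class.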
The reference can be found here:
Reviewed By: CarolineConcatto
Differential Revision: https://reviews.llvm.org/D105572