This patch adds support for lowering a vector.transfer_write of an
all-zero vector of type vector<[16x16]xi8> to the SME zero {za}
instruction [1], which zeroes the entire ZA accumulator, and then
writing the result out to memory with the str instruction [2].

This contributes to supporting a path from linalg.fill to SME.

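As a rough illustration, the kind of input this lowering targets might
look like the sketch below. The function name, the memref<?x?xi8>
destination, and the in_bounds attribute are assumptions made for the
example and are not taken from the patch itself.

```mlir
// Hypothetical example: write an all-zero scalable 2-D i8 vector to memory.
func.func @transfer_write_2d_zero_i8(%dest : memref<?x?xi8>) {
  // Starting offset into the destination memref.
  %c0 = arith.constant 0 : index
  // All-zero vector of the type named in the description above.
  %zero = arith.constant dense<0> : vector<[16x16]xi8>
  // The op the description says is lowered to zero {za} followed by str.
  vector.transfer_write %zero, %dest[%c0, %c0] {in_bounds = [true, true]}
    : vector<[16x16]xi8>, memref<?x?xi8>
  return
}
```
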
[1] https://developer.arm.com/documentation/ddi0602/2022-06/SME-Instructions/ZERO--Zero-a-list-of-64-bit-element-ZA-tiles-
[2] https://developer.arm.com/documentation/ddi0602/2022-06/SME-Instructions/STR--Store-vector-from-ZA-array-

Could you update "mlir/test/Target/LLVMIR/arm-sme.mlir" as well?