EDIT: I should note that all this only applies to base+offset addressing modes.
We used to generate stuff like
    umov.b w8, v0[2]
    strb w8, [x0, x1]
because the STR*ro* patterns were preferred to ST1*.
Instead, generate:
    add x8, x0, x1
    st1.b { v0 }[2], [x8]
This patch increases the ST1* AddedComplexity to achieve that.
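
For illustration, here is a minimal IR sketch of the kind of input that should exercise this: a single vector lane stored to a base + register-offset address. The function and value names are made up for this example and are not taken from the patch, and the syntax assumes a recent LLVM with opaque pointers.

```llvm
; Hypothetical example, not from the patch's test suite.
; Extract lane 2 of a <16 x i8> vector and store it at %base + %offset.
define void @st1lane_ro_16b_sketch(<16 x i8> %v, ptr %base, i64 %offset) {
  ; Base + register-offset address, the case the STR*ro* patterns match.
  %addr = getelementptr i8, ptr %base, i64 %offset
  ; Lane extract; this became the umov.b when the STR*ro* pattern won.
  %elt = extractelement <16 x i8> %v, i32 2
  store i8 %elt, ptr %addr
  ret void
}
```

Running something like this through llc for an AArch64 target should show whether the lane store is selected as an add followed by st1.b, or as the umov.b/strb pair.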
Thanks!
-Ahmed