In DAGTypeLegalizer::GenWidenVectorStores the algorithm assumes it only
ever deals with fixed-width types, hence the offsets for each individual
store never take 'vscale' into account. I've changed the main loop in
that function to use TypeSize instead of unsigned for tracking the
remaining store amount and offset increment. In addition, I've changed
the loop to use the new IncrementPointer helper function for updating
the addresses in each iteration, since this handles scalable vector
types.
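To illustrate the idea (this is not the code from the patch), here is a minimal standalone sketch of a store-splitting loop that tracks the offset with a scalable-aware size rather than a plain unsigned. `SketchSize`, `StorePiece`, `splitStore` and `resolve` are hypothetical names invented for this example; `SketchSize` only loosely mimics what TypeSize provides.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical stand-in for a size that may be a multiple of vscale.
struct SketchSize {
  uint64_t MinBytes; // known minimum size in bytes
  bool Scalable;     // true if the real size is MinBytes * vscale
};

// One part of the split store.
struct StorePiece {
  SketchSize Offset; // offset from the base pointer
  SketchSize Width;  // number of bytes written by this piece
};

// Split a store of Total bytes into pieces, picking the widest candidate
// that still fits. PartWidths must be sorted in descending order.
std::vector<StorePiece> splitStore(SketchSize Total,
                                   const std::vector<uint64_t> &PartWidths) {
  std::vector<StorePiece> Pieces;
  SketchSize Offset{0, Total.Scalable}; // inherits scalability from the store
  uint64_t Remaining = Total.MinBytes;
  while (Remaining > 0) {
    uint64_t Chosen = 0;
    for (uint64_t W : PartWidths) {
      if (W <= Remaining) {
        Chosen = W;
        break;
      }
    }
    assert(Chosen != 0 && "no candidate width fits the remaining bytes");
    Pieces.push_back({Offset, {Chosen, Total.Scalable}});
    Offset.MinBytes += Chosen; // the increment is scaled by vscale if Scalable
    Remaining -= Chosen;
  }
  return Pieces;
}

// Turn a size into concrete bytes once vscale is known.
uint64_t resolve(SketchSize S, uint64_t VScale) {
  return S.Scalable ? S.MinBytes * VScale : S.MinBytes;
}
```

For a scalable store with a minimum size of 24 bytes and candidate widths {16, 8, 4}, the second piece starts at a minimum offset of 16 bytes, which corresponds to 32 bytes when vscale is 2. A plain unsigned offset would lose that scaling.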
Whilst fixing this function I also fixed a minor issue in
IncrementPointer, where the add instruction was not given the
no-unsigned-wrap flag even though the fixed-width case sets it.
Also, I've added a report_fatal_error in GenWidenVectorTruncStores,
since this code currently uses a sequence of element-by-element scalar
stores, which cannot be emitted when the element count is not known at
compile time.
I've added new tests in
  CodeGen/AArch64/sve-intrinsics-stores.ll
  CodeGen/AArch64/sve-st1-addressing-mode-reg-imm.ll
for the changes in GenWidenVectorStores.
getAlign(), not getOriginalAlign(), I think? Try writing a testcase where the store is at some offset inside a global variable, or something like that, and the difference should be clear.
Also, ideally, we'd continue to use plain getOriginalAlign() for the non-scalable case. This dance with the alignment is only necessary because we can't specify a scalable offset in the MachinePointerInfo. Maybe IncrementPointer should be involved in this somehow?
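As a rough illustration of why the two alignment queries can differ, assume that the original (base) alignment describes the underlying object and that the alignment of the access itself is the base alignment reduced by the offset into that object. The helper below is a hypothetical standalone sketch of that reduction, not LLVM's own implementation.

```cpp
#include <cassert>
#include <cstdint>

// Largest power of two that divides both BaseAlign and Offset.
// BaseAlign must be a nonzero power of two; Offset is in bytes.
uint64_t commonAlign(uint64_t BaseAlign, uint64_t Offset) {
  assert(BaseAlign != 0 && (BaseAlign & (BaseAlign - 1)) == 0 &&
         "base alignment must be a power of two");
  if (Offset == 0)
    return BaseAlign; // the access starts at the base, nothing is lost
  uint64_t OffsetAlign = Offset & (~Offset + 1); // lowest set bit of Offset
  return OffsetAlign < BaseAlign ? OffsetAlign : BaseAlign;
}
```

With a 16-byte aligned global, a store at offset 4 is only known to be 4-byte aligned, while a store at offset 32 keeps the full 16-byte alignment. A testcase that stores at a non-zero offset inside a global should therefore show the difference.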