We currently allow two strides: 1 for consecutive accesses, and -1 for reverse consecutive accesses. In the near future, I want to add support for other constant strides, and getting the API worked through first simplifies that work. This is admittedly somewhat debatable on its own merits as an NFC, but I figured I'd put it out and see what reviewers thought.
Event Timeline
Do you have any particular examples in mind? At the moment, strides > 1 would be handled as an interleave group, I think, and maybe that would also work here.
Constant strides greater than 8 currently end up as masked loads or stores, and the addressing for a masked load vs a strided load is different. The latter only uses the first lane of the pointer vector.
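To make the addressing difference concrete, here is a minimal scalar sketch in C++. It assumes the non-strided lowering needs one address per lane, while a strided access needs only a base pointer and a stride. The names `stridedLoad` and `perLaneLoad` are hypothetical and are not LLVM APIs.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Scalar model of a strided load: a single base address plus an element
// stride describes every lane, so only the first lane's pointer is needed.
std::vector<int32_t> stridedLoad(const int32_t *base, std::ptrdiff_t stride,
                                 std::size_t numLanes) {
  assert(base != nullptr);
  std::vector<int32_t> result(numLanes);
  for (std::size_t lane = 0; lane < numLanes; ++lane)
    result[lane] = base[static_cast<std::ptrdiff_t>(lane) * stride];
  return result;
}

// Scalar model of a load driven by a full pointer vector: every lane
// carries its own address, even when the addresses form a regular pattern.
std::vector<int32_t> perLaneLoad(const std::vector<const int32_t *> &ptrs) {
  std::vector<int32_t> result(ptrs.size());
  for (std::size_t lane = 0; lane < ptrs.size(); ++lane) {
    assert(ptrs[lane] != nullptr);
    result[lane] = *ptrs[lane];
  }
  return result;
}
```

Both functions return the same values when the pointer vector happens to be `base + lane * stride`; the difference is how much address information has to be materialized.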
We could also use it for strided stores (even with strides less than 8). On a target with wide loads but no masking, this would enable strided patterns entirely. On a target with both wide memory ops and masking, there can still be a performance difference between strided access and masking.
(For context, RISCV has native strided load and store instructions.)
And all of the above is really just a building block toward removing the stride==1 speculation and handling runtime strides. That is the case I actually care about; I just need to get the codegen part into acceptable shape first. (This shows up in several places in SPEC2017 x264.)
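For readers unfamiliar with stride==1 speculation, the following is a hand-written C++ sketch of the idea, not the vectorizer's actual output. The function `scaleStrided` is hypothetical: the loop is versioned on a runtime check, with a fast path that assumes consecutive access and a fallback that handles any other stride.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical loop with a stride known only at run time.
// Assumes stride >= 1 and that src holds at least (n - 1) * stride + 1 elements.
void scaleStrided(float *dst, const float *src, std::ptrdiff_t stride,
                  std::size_t n) {
  assert(dst != nullptr && src != nullptr);
  if (stride == 1) {
    // Speculated version: accesses are consecutive, so this loop can be
    // vectorized with ordinary wide loads and stores.
    for (std::size_t i = 0; i < n; ++i)
      dst[i] = src[i] * 2.0f;
    return;
  }
  // Fallback version: taken whenever the speculation fails. With native
  // strided memory operations this loop could be vectorized directly,
  // which would make the runtime check above unnecessary.
  for (std::size_t i = 0; i < n; ++i)
    dst[i] = src[static_cast<std::ptrdiff_t>(i) * stride] * 2.0f;
}
```

Handling runtime strides directly would collapse the two versions into one loop that takes the stride as an operand.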