This is an archive of the discontinued LLVM Phabricator instance.

[LV] Model stride in VPWidenMemoryInstructionRecipe [nfc]
Abandoned · Public

Authored by reames on Apr 14 2023, 3:27 PM.

Details

Reviewers
fhahn
Ayal
Summary

We currently allow two strides: 1 for consecutive accesses and -1 for reverse consecutive accesses. In the near future, I want to add support for other constant strides, and working through the API first simplifies that work. This is admittedly somewhat debatable on its own merits as an NFC change, but I figured I'd put it up and see what reviewers think.
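For readers unfamiliar with the terminology, the following is a minimal source-level sketch of the three access patterns the summary refers to. The function names are hypothetical and chosen for illustration only; they are not part of the patch or of LLVM.

```cpp
#include <cstddef>
#include <vector>

// Stride 1: consecutive access, reading a[0], a[1], a[2], ...
long sumConsecutive(const std::vector<int> &a) {
  long sum = 0;
  for (std::size_t i = 0; i < a.size(); ++i)
    sum += a[i];
  return sum;
}

// Stride -1: reverse consecutive access. The read of src walks downward
// by one element per iteration while the write to dst walks upward.
void reverseCopy(const std::vector<int> &src, std::vector<int> &dst) {
  const std::size_t n = src.size();
  dst.resize(n);
  for (std::size_t i = 0; i < n; ++i)
    dst[i] = src[n - 1 - i];
}

// Constant stride 3: an example of the "other constant strides" case,
// reading a[0], a[3], a[6], ...
long sumEveryThird(const std::vector<int> &a) {
  long sum = 0;
  for (std::size_t i = 0; i < a.size(); i += 3)
    sum += a[i];
  return sum;
}
```

The first two loops correspond to the strides the recipe already distinguishes; the third is the kind of loop a more general stride field would describe.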

Event Timeline

reames created this revision. Apr 14 2023, 3:27 PM
Herald added a project: Restricted Project. Apr 14 2023, 3:27 PM
reames requested review of this revision. Apr 14 2023, 3:27 PM
fhahn added a comment. Apr 21 2023, 9:44 AM

Do you have any particular examples in mind? At the moment, strides > 1 would be handled as interleave groups, I think, and maybe that would also work.

reames added a comment. May 1 2023, 8:15 AM

> Do you have any particular examples in mind? At the moment, strides > 1 would be handled as interleave groups, I think, and maybe that would also work.

Constant strides greater than 8 currently end up as masked loads or stores, and the addressing for a masked load differs from that of a strided load. The latter only uses the first lane of the pointer vector.
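The addressing difference can be sketched with a scalar model. This is an illustration under stated assumptions, not LLVM code: `VF`, `perLaneLoad`, and `stridedLoad` are hypothetical names, the vector width of 4 is arbitrary, and the stride is counted in elements for simplicity.

```cpp
#include <array>
#include <cstddef>

constexpr std::size_t VF = 4; // hypothetical vector width in lanes

// Per-lane addressing: every lane supplies its own pointer, so the whole
// pointer vector must be materialized before the load.
std::array<int, VF> perLaneLoad(const std::array<const int *, VF> &ptrs) {
  std::array<int, VF> out{};
  for (std::size_t lane = 0; lane < VF; ++lane)
    out[lane] = *ptrs[lane];
  return out;
}

// Strided addressing: only the first lane's pointer and one scalar stride
// are needed; the remaining addresses are implied.
std::array<int, VF> stridedLoad(const int *base, std::ptrdiff_t stride) {
  std::array<int, VF> out{};
  for (std::size_t lane = 0; lane < VF; ++lane)
    out[lane] = base[static_cast<std::ptrdiff_t>(lane) * stride];
  return out;
}
```

When the per-lane pointers happen to be evenly spaced, both forms read the same elements, but the strided form needs only `ptrs[0]` and the stride as inputs.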

We could also use it for strided stores, even at strides below 8. On a target with wide loads but no masking, this is what would enable strided patterns at all. On a target with both wide memory ops and masking, there can still be a performance difference between a strided access and a masked one.

(For context, RISC-V has native strided load and store instructions.)

And all of the above is really just a building block toward removing the stride==1 speculation and handling runtime strides. That is the case I actually care about; I just need to get the codegen part into acceptable shape first. (This shows up in several places in SPEC2017 x264.)
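As a rough source-level analogue of what "stride==1 speculation" means here, the sketch below versions a loop on a runtime stride: a guarded fast path assumes the stride is 1, and everything else falls back to a general loop. The function name and structure are hypothetical and only illustrate the idea; they do not reproduce what the vectorizer emits.

```cpp
#include <cstddef>

// Sums n elements starting at p, stepping by a stride that is only known
// at run time (counted in elements).
long sumRuntimeStride(const int *p, std::size_t n, std::ptrdiff_t stride) {
  long sum = 0;
  if (stride == 1) {
    // Speculated case: the access is consecutive, so this loop can use
    // plain wide loads when vectorized.
    for (std::size_t i = 0; i < n; ++i)
      sum += p[i];
    return sum;
  }
  // General case: any other stride. This is the loop that would benefit
  // from a native strided load instead of a scalar fallback.
  for (std::size_t i = 0; i < n; ++i)
    sum += p[static_cast<std::ptrdiff_t>(i) * stride];
  return sum;
}
```

Handling runtime strides directly would make the guard unnecessary, since a single strided loop covers both cases.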

reames abandoned this revision. May 15 2023, 10:44 AM