This is an archive of the discontinued LLVM Phabricator instance.

[mlir][Linalg] Refactor vectorization of conv1d more aggressively.
ClosedPublic

Authored by nicolasvasilache on Oct 29 2021, 8:02 AM.

Details

Summary

This is more of a proof of concept for now, although it is correct and does not result in noticeable perf degradations.

This is what a better decoupling of transfer read/write from vectorization of conv would look like. This form is close to ready to plop into a new vector.conv op, with the vector.transfer operations generalized as part of generic vectorization once the ConvolutionOpInterface properties are inferred from the indexing maps.
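
To make the decoupling concrete, here is a minimal scalar sketch in plain Python rather than MLIR. It is an illustration under assumptions, not the code in this revision: the function name `conv1d_nwc_wcf`, the NWC/WCF/NWF layouts, and the `stride`/`dilation` parameters are chosen for the example. The operands are read whole once, the compute step works only on slices of the already-loaded values, and the result is written back once at the end.

```python
def conv1d_nwc_wcf(inp, flt, stride=1, dilation=1):
    """Hypothetical sketch: conv1d of an NWC input with a WCF filter -> NWF result."""
    n_size, in_w, c_size = len(inp), len(inp[0]), len(inp[0][0])
    kw_size, f_size = len(flt), len(flt[0][0])
    out_w = (in_w - dilation * (kw_size - 1) - 1) // stride + 1

    # Step 1: stand-in for the transfer reads. Each operand is loaded whole, once.
    lhs = [[list(pixel) for pixel in row] for row in inp]   # N x W x C
    rhs = [[list(chan) for chan in tap] for tap in flt]     # KW x C x F
    res = [[[0.0] * f_size for _ in range(out_w)] for _ in range(n_size)]

    # Step 2: compute only on slices extracted from the loaded values.
    for kw in range(kw_size):
        rhs_slice = rhs[kw]                                  # C x F slice of the filter
        for n in range(n_size):
            for w in range(out_w):
                lhs_slice = lhs[n][w * stride + kw * dilation]  # C slice of the input
                for f in range(f_size):
                    # Contraction over the channel dimension.
                    res[n][w][f] += sum(
                        lhs_slice[c] * rhs_slice[c][f] for c in range(c_size)
                    )

    # Step 3: stand-in for the single transfer write of the accumulated result.
    return res
```

Because steps 1 and 3 never touch the arithmetic, the middle step is the only part that is specific to convolution, which is the property that would let it move into a dedicated op.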

Event Timeline

nicolasvasilache requested review of this revision. Oct 29 2021, 8:02 AM
ThomasRaoux accepted this revision. Nov 1 2021, 3:34 PM
This revision is now accepted and ready to land. Nov 1 2021, 3:34 PM
antiagainst accepted this revision. Nov 2 2021, 6:51 AM

Nice! In my impl in IREE I was sort of doing this load-whole-and-extract-slice pattern, but only for filters. I'd assume it should be helpful because we load the full filter ahead and put it in registers to increase reuse? But CPU might differ here.