This commit adds a pattern to vectorize Linalg convolution
ops implementing the ConvolutionOpInterface. It is able to
handle n-D convolution with different layouts and strides.
The pattern directly lowers the convolution op into vector
transfer read/write/contract ops by effectively unrolling
the convolution output window dimensions.
This is just one step in the convolution CodeGen flow, where we
expect to start from a high-level convolution op, convert it
to a proper linalg.conv named op, tile the linalg.conv op
a few times, and then tile the filter window dimensions by
size 1. Only afterwards can this pattern kick in, so it
assumes that all filter window dimensions have size 1.
There are certainly other ways to lower convolution
ops, including substantially different approaches such as img2col, indirect
convolution, Fourier, or Winograd implementations. This
approach is orthogonal to those and does not preclude them.
However, this pattern enables a direct tiling and vectorization
path that assumes minimal capabilities from the hardware
(just vector load/store/arithmetic able to support matmul),
so it is expected to be a good default for the various targets
that fit that description.
Depends On D111721
You wrote in the description that:
But how would these different strategies pan out behind this API (which seems totally generically named)?