This is an archive of the discontinued LLVM Phabricator instance.

[mlir][Linalg] Add a first vectorization pattern for conv1d in NWCxWCF format.
ClosedPublic

Authored by nicolasvasilache on Oct 15 2021, 9:12 AM.

Details

Summary

This revision uses the newly refactored StructuredGenerator to create a simple vectorization for conv1d_nwc_wcf.

Note that the pattern is not specific to the op and is technically not even specific to the ConvolutionOpInterface (modulo minor details related to dilations and strides).

The overall design follows the same ideas as the lowering of vector::ContractionOp -> vector::OuterProduct: it seeks to be minimally complex, composable and extensible while avoiding inference analysis. Instead, we metaprogram the maps/indexings we expect and we match against them.

This is just a first stab and still needs to be evaluated for performance.
Other tradeoffs are possible that should be explored.

Diff Detail

Event Timeline

nicolasvasilache requested review of this revision.Oct 15 2021, 9:12 AM

Very nice utilities like StructuredGenerator, iters, layout! Thanks a lot for putting this up! It's much clear to me to understand how to use them now. I just have a few questions inlined.

Do you want to land this? Or is this just for illustration purpose? I can change my patch (which also handles higher-D cases) to follow the structure here.

mlir/lib/Dialect/Linalg/Transforms/Vectorization.cpp
1465

This TODO is a bit confusing to me. We are already doing this right? You mean not?

1485

We don't need to check the region to see what is the payload computation?

1535

I think we don't want to read from resShaped, which is updated later at L1539; we want to read the initial output. We are updating non-opverlapping slices so it's fine to read the init output. The problem of reading from the continuously updated resShaped could confuse vector transformations as it causes interleaving read/write.

Very nice utilities like StructuredGenerator, iters, layout! Thanks a lot for putting this up! It's much clear to me to understand how to use them now. I just have a few questions inlined.

Do you want to land this? Or is this just for illustration purpose? I can change my patch (which also handles higher-D cases) to follow the structure here.

Yes, this is meant to land, I have another few changes incoming which allowed me to plug everything together end-to-end and benchmark.
Feel free to rebase on this and add other things that you need on top of this current structure.

nicolasvasilache marked 3 inline comments as done.Oct 19 2021, 7:24 AM
nicolasvasilache added inline comments.
mlir/lib/Dialect/Linalg/Transforms/Vectorization.cpp
1465

was meant to convey: "address the fact that ..."
rephrased to make it more readable.

1535

interestingly this actually improves vector transformations because the vector hoisting patterns do not work as expected if I always read from the same element but continuously write into the output: these patterns really want to see the same tensor being read/written.
This is roughly a 8x performance difference.

nicolasvasilache marked 2 inline comments as done.

Address review.
Refactor to properly integrate into vectorizeLinalgOp which is used by other patterns down the line.

ftynse accepted this revision.Oct 20 2021, 6:36 AM
This revision is now accepted and ready to land.Oct 20 2021, 6:36 AM
antiagainst accepted this revision.Oct 20 2021, 7:01 AM
antiagainst added inline comments.
mlir/lib/Dialect/Linalg/Transforms/Vectorization.cpp
1535

Hmm, that's not what I've seen previously... But maybe things have changed quite a bit. I'll also do some experiments on my side later.

mlir/lib/Dialect/Linalg/Transforms/Vectorization.cpp
1535

note that the behavior I describe is on tensors, in your experiments, did you also do these vectorizations on tensors ?
I'll send you an internal repro to see this.