This is an archive of the discontinued LLVM Phabricator instance.

[mlir][linalg] Swap tensor.extract_slice(linalg.fill)
ClosedPublic

Authored by antiagainst on Sep 16 2022, 7:57 PM.

Details

Summary

This commit adds a pattern to swap

tensor.extract_slice(linalg.fill(%cst, %init))

into

linalg.fill(%cst, tensor.extract_slice(%init))

when the linalg.fill op have no other users.
This helps to reduce the fill footprint.

Diff Detail

Event Timeline

antiagainst created this revision.Sep 16 2022, 7:57 PM
antiagainst requested review of this revision.Sep 16 2022, 7:57 PM
mravishankar requested changes to this revision.Sep 16 2022, 8:11 PM

I am apprehensive of adding this as a canonicalization. Canonicalization that involves multiple operations has always been a footgun

This revision now requires changes to proceed.Sep 16 2022, 8:11 PM

+1, please just create a new file with a proper name (and suffixed with Patterns.cpp) and let's trigger this on-demand.
It is quite close to a canonicalization but it is not really a canonicalization and can be surprising..

Friday evening my brain doesnt work as well. This is basically a specialized implementation of https://github.com/llvm/llvm-project/blob/main/mlir/include/mlir/Dialect/Tensor/Transforms/Transforms.h#L29 which works on any operation that implements the TilingInterface. If you can just use that method, we dont need to diverge (but that does introduce a more heavy weight treatment of this, and the implementation here is simpler). Either way, this shouldnt be a canonicalization.

Address comments

Address comments

Fix comments

@mravishankar @nicolasvasilache : moved to a separate file and entry point now!

Friday evening my brain doesnt work as well. This is basically a specialized implementation of https://github.com/llvm/llvm-project/blob/main/mlir/include/mlir/Dialect/Tensor/Transforms/Transforms.h#L29 which works on any operation that implements the TilingInterface. If you can just use that method, we dont need to diverge (but that does introduce a more heavy weight treatment of this, and the implementation here is simpler). Either way, this shouldnt be a canonicalization.

Thanks for the suggestion! I tried to use that but found the above 1) does not support non-1 strides, & 2) cannot handle rank-reducing extract_slice ops. I'm not sure how easy to extend the above for both; esp. the above is used for tiling, which later bufferization relies on the tiled structure. OTOH, this pattern itself is already very simple; just a few lines. I'd prefer to keep the current impl.

mravishankar accepted this revision.Sep 20 2022, 1:31 PM
This revision is now accepted and ready to land.Sep 20 2022, 1:31 PM
This revision was automatically updated to reflect the committed changes.