This is an archive of the discontinued LLVM Phabricator instance.

[mlir][tensor] Add pattern to extract from insert_slice destination
Changes Planned, Public

Authored by antiagainst on Sep 14 2022, 8:59 AM.

Details

Summary

When we have a producer tensor.insert_slice op and a consumer
tensor.extract_slice op and their slices are disjoint, we can
update the tensor.extract_slice op to use the tensor.insert_slice
op's destination tensor. This helps to break the chain so that
it can further enable optimizations.

Event Timeline

antiagainst created this revision. Sep 14 2022, 8:59 AM
antiagainst requested review of this revision. Sep 14 2022, 8:59 AM

Move the helper function to ViewLikeInterface

mravishankar added inline comments. Sep 19 2022, 1:41 PM
mlir/lib/Dialect/Tensor/Transforms/ExtractFromInsertSliceDest.cpp
19 ↗(On Diff #460496)

This is a very specific pattern, and post-bufferization this shouldn't matter. It seems very difficult to reason about in non-static cases (and here it is all for static ones).

antiagainst marked an inline comment as done. Sep 20 2022, 11:21 AM
antiagainst added inline comments.
mlir/lib/Dialect/Tensor/Transforms/ExtractFromInsertSliceDest.cpp
19 ↗(On Diff #460496)

Some dynamic dimensions are fine, but right, we need at least one statically disjoint dimension to be sure.
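The condition described above can be sketched outside of MLIR. The following is only an illustration, not the code from this revision: `DimRange` and `provablyDisjoint` are hypothetical names, dynamic offsets and sizes are modeled as `std::nullopt`, and unit strides are assumed.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <optional>
#include <vector>

// Hypothetical model of one slice dimension. A value is present only when it
// is static; std::nullopt stands for a dynamic (SSA) offset or size.
struct DimRange {
  std::optional<int64_t> offset;
  std::optional<int64_t> size;
};

// Returns true only when disjointness is provable: at least one dimension has
// static offsets and sizes on both slices, and the half-open intervals
// [offset, offset + size) do not overlap there. Assumes unit strides.
bool provablyDisjoint(const std::vector<DimRange> &a,
                      const std::vector<DimRange> &b) {
  assert(a.size() == b.size() && "slices must have the same rank");
  for (std::size_t i = 0; i < a.size(); ++i) {
    const DimRange &x = a[i];
    const DimRange &y = b[i];
    // A dynamic dimension proves nothing either way, so skip it.
    if (!x.offset || !x.size || !y.offset || !y.size)
      continue;
    if (*x.offset + *x.size <= *y.offset || *y.offset + *y.size <= *x.offset)
      return true;
  }
  // No dimension was statically disjoint: stay conservative.
  return false;
}
```

The check is deliberately conservative: a single statically disjoint dimension is sufficient, and slices whose dimensions are all dynamic are never reported as disjoint.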

antiagainst marked an inline comment as done.

Rebase

Some dynamic dimensions are fine, but right, we need at least one statically disjoint dimension to be sure.

I am just saying this seems like something that is addressing a symptom and not the root cause. Maybe root causing it might help avoid such specific patterns.

This kind of IR is generated by tiling and unrolling at the tensor level. Tiling creates the structure for (...) { extract_slice, compute, insert_slice }. After unrolling the loop, we have consecutive blocks of extract_slice0, compute0, insert_slice0, extract_slice1, ..., where extract_slice1 extracts from the result of insert_slice0. This pattern breaks the dependence chain to make future transformations easier, since they no longer need to look through the whole extract/insert op chain. It's a simple pattern whose application we can control explicitly. Not doing this would mean we either need to bake unrolling into tiling, or make later transformations able to look through the extract/insert chain. To me this way seems cleaner than those.
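The effect on an unrolled chain can be illustrated with a toy model. This is a simplified sketch, not the MLIR pattern itself: `Slice`, `Value`, and `resolveExtractSource` are hypothetical names, slices are one-dimensional and fully static, and the loop stands in for what repeated application of a single-step rewrite would achieve.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical 1-D stand-in for a static slice: [offset, offset + size).
struct Slice {
  int64_t offset;
  int64_t size;
};

bool disjoint(Slice a, Slice b) {
  return a.offset + a.size <= b.offset || b.offset + b.size <= a.offset;
}

// Toy tensor value. dest < 0 marks the original tensor; otherwise the value
// is the result of inserting `inserted` into values[dest].
struct Value {
  int dest;
  Slice inserted;
};

// Walks up the insert chain while the inserted slice is disjoint from the
// extracted one, and returns the index of the value the extract can read
// from instead of `source`.
int resolveExtractSource(const std::vector<Value> &values, int source,
                         Slice extracted) {
  while (values[source].dest >= 0 &&
         disjoint(values[source].inserted, extracted))
    source = values[source].dest;
  return source;
}
```

With values 0 (original tensor), 1 (insert of [0, 4) into 0), and 2 (insert of [4, 8) into 1), an extract of [8, 12) from value 2 resolves to value 0, while an extract of [4, 8) from value 2 stays on value 2 because the slices overlap.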

antiagainst planned changes to this revision. Sep 27 2022, 8:36 AM

Not immediately needed anymore, so taking this back for now.

Move utils to Affine/Utils to avoid introducing ArithUtils dependencies to ViewLikeInterface

antiagainst planned changes to this revision. Oct 8 2022, 10:38 AM
antiagainst planned changes to this revision. Feb 24 2023, 2:12 PM