When we have a producer tensor.insert_slice op and a consumer
tensor.extract_slice op whose slices are disjoint, we can
update the tensor.extract_slice op to read from the
tensor.insert_slice op's destination tensor instead. This breaks
the use-def chain between the two ops and enables further
optimizations.
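As a sketch of the rewrite (with hypothetical static shapes, not taken from the patch):

```mlir
// Before: the insert writes rows [0, 4) while the extract reads
// rows [8, 12); the two slices are statically disjoint.
%0 = tensor.insert_slice %slice into %dest[0, 0] [4, 4] [1, 1]
    : tensor<4x4xf32> into tensor<16x16xf32>
%1 = tensor.extract_slice %0[8, 0] [4, 4] [1, 1]
    : tensor<16x16xf32> to tensor<4x4xf32>

// After: the extract can read directly from the destination tensor,
// bypassing the insert.
%1 = tensor.extract_slice %dest[8, 0] [4, 4] [1, 1]
    : tensor<16x16xf32> to tensor<4x4xf32>
```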
Diff Detail
- Repository: rG LLVM Github Monorepo
Event Timeline
mlir/lib/Dialect/Tensor/Transforms/ExtractFromInsertSliceDest.cpp:19 ↗ (On Diff #460496)
This is a very specific pattern, and post-bufferization this shouldn't matter. It seems very difficult to reason about this in non-static cases (and here this is all for static).
mlir/lib/Dialect/Tensor/Transforms/ExtractFromInsertSliceDest.cpp:19 ↗ (On Diff #460496)
Some dynamic dimensions are fine, but right, we need at least one statically disjoint dimension to be sure.
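For instance (hypothetical IR, not from the diff), a dynamic size in one dimension is fine as long as another dimension is statically disjoint:

```mlir
// Dim 1 has the dynamic size %d, but dim 0 is fully static:
// the insert writes rows [0, 4) while the extract reads rows [8, 12),
// so the slices cannot overlap regardless of the value of %d.
%0 = tensor.insert_slice %slice into %dest[0, 0] [4, %d] [1, 1]
    : tensor<4x?xf32> into tensor<16x16xf32>
%1 = tensor.extract_slice %0[8, 0] [4, %d] [1, 1]
    : tensor<16x16xf32> to tensor<4x?xf32>
```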
> Some dynamic dimensions are fine, but right, we need at least one statically disjoint dimension to be sure.
I am just saying this seems like something that is addressing a symptom and not the root cause. Maybe root-causing it would help avoid such specific patterns.
This kind of IR is generated by tiling and unrolling at the tensor level. Tiling creates the structure of for (...) { extract_slice, compute, insert_slice }. After unrolling the loop, we have consecutive blocks of extract_slice0, compute0, insert_slice0, extract_slice1, ..., where extract_slice1 extracts from insert_slice0. This pattern breaks the dependence chain there to make future transformations easier: they no longer need to look through the whole extract/insert op chain. It's a simple pattern whose application we can explicitly control. Not doing this would mean we either need to bake unrolling into tiling, or make later transformations able to look through the extract/insert chain. To me this way seems cleaner than those.
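Concretely, the unrolled IR looks roughly like this (hypothetical tile size of 4 along dim 0; "compute" stands in for the tiled payload op):

```mlir
// Iteration 0 of the unrolled loop.
%e0 = tensor.extract_slice %t[0, 0] [4, 16] [1, 1]
    : tensor<16x16xf32> to tensor<4x16xf32>
%c0 = "compute"(%e0) : (tensor<4x16xf32>) -> tensor<4x16xf32>
%i0 = tensor.insert_slice %c0 into %t[0, 0] [4, 16] [1, 1]
    : tensor<4x16xf32> into tensor<16x16xf32>

// Iteration 1 extracts from %i0, but its slice (rows [4, 8)) is
// disjoint from what %i0 wrote (rows [0, 4)) ...
%e1 = tensor.extract_slice %i0[4, 0] [4, 16] [1, 1]
    : tensor<16x16xf32> to tensor<4x16xf32>
// ... so the pattern can rewrite %e1 to extract from %t directly,
// severing the dependence on %i0.
```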
Move utils to Affine/Utils to avoid introducing ArithUtils dependencies on ViewLikeInterface