This is an archive of the discontinued LLVM Phabricator instance.

[mlir] Bubble up tensor.extract_slice above linalg operation
ClosedPublic

Authored by okkwon on Mar 24 2022, 2:44 PM.

Details

Summary

Bubble up extract_slice above Linalg operation.

A sequence of operations

%0 = linalg.<op> ... arg0, arg1, ...
%1 = tensor.extract_slice %0 ...

can be replaced with

%0 = tensor.extract_slice %arg0
%1 = tensor.extract_slice %arg1
%2 = linalg.<op> ... %0, %1, ...

This results in the reduce computation of the linalg operation.

The implementation uses the tiling utility functions. One difference
from the tiling process is that we don't need to insert the checking
code for the out-of-bound accesses. The use of the slice itself
represents that the code writer is sure about the boundary condition.
To avoid adding the boundary condtion check code, addBoundaryCheck
is introduced for the tiling utility functions.

Diff Detail

Event Timeline

okkwon created this revision.Mar 24 2022, 2:44 PM
Herald added a reviewer: aartbik. · View Herald Transcript
Herald added a project: Restricted Project. · View Herald Transcript
okkwon requested review of this revision.Mar 24 2022, 2:44 PM
okkwon edited reviewers, added: mravishankar; removed: rriddle, aartbik.Mar 24 2022, 2:48 PM
mravishankar accepted this revision.Mar 25 2022, 9:33 AM

This is really nice! I am stamping this. Would be really good to wait for @nicolasvasilache to take a look, but this is good to go for me!

mlir/include/mlir/Dialect/Linalg/Utils/Utils.h
175

Do you want this value to be default true ?

mlir/lib/Dialect/Linalg/Transforms/BubbleUpExtractSlice.cpp
29
This revision is now accepted and ready to land.Mar 25 2022, 9:33 AM
okkwon updated this revision to Diff 418261.Mar 25 2022, 9:51 AM
okkwon marked an inline comment as done.

Replace getAsValues with getValueOrCreateConstantIndexOp.

okkwon marked an inline comment as done.Mar 25 2022, 9:51 AM
okkwon added inline comments.
mlir/include/mlir/Dialect/Linalg/Utils/Utils.h
175

Initially I set it to true by default. But I found there are only few cases I need to modify, so I just modified the places instead.

nicolasvasilache accepted this revision.Mar 30 2022, 1:06 AM

Looks good, thanks!

mlir/include/mlir/Dialect/Linalg/Utils/Utils.h
175

Can we rename this omitPartialTileCheck and rewrite the comment using this terminology?
Something like:
"In cases where we statically know that the partial/boundary tile condition is unnecessary, we can omit the emission of "

mlir/lib/Dialect/Linalg/Transforms/BubbleUpExtractSlice.cpp
60

Add a TODO that we could relax this in the future if we want heuristics to detect that all uses are "small" portions of the output ?

82

typo: projected

mlir/lib/Dialect/Linalg/Utils/Utils.cpp
842

Can we invert the logic and create an early exit + comments in the simpler case?

gysit added a comment.Mar 30 2022, 3:23 AM

The pattern looks good to me!

I have one comment regarding padding. When padding we actually want the padded linalg operation to perform redundant computation for the sake of easier / enabling vectorization. Padding thus introduces a tensor.pad before the linalg op and an extract slice after the linalg op. The latter removes the padding from the linalg op result tensor. Applying the bubble pattern to a padded linalg operation would partially undo the padding by moving the extract slice before the linalg op - at least if I understand the pattern correctly - which is most likely not what we want.

I don't think this is a concern as long as the pattern is applied selectively and not as a canonicalization or similiar.

mlir/include/mlir/Dialect/Linalg/Transforms/Transforms.h
122

nit: typo

okkwon updated this revision to Diff 419192.Mar 30 2022, 9:57 AM

Address the comments

okkwon marked 5 inline comments as done.Mar 30 2022, 10:00 AM

Thanks for the review! Addressed all the TODO items.

@gysit Would you mind giving me a simple example of a padding case? I'd like to understand the situation better. Thanks!

okkwon updated this revision to Diff 419198.Mar 30 2022, 10:09 AM

Revert an unnecessary change on a comment.

Sure, a padded linalg.fill op looks as follows:

mlir
        %9 = tensor.pad %7 low[%c0, %c0] high[%3, %8] {
        ^bb0(%arg8: index, %arg9: index):
          tensor.yield %cst : f32
        } : tensor<?x?xf32> to tensor<8x16xf32>
        %10 = linalg.fill {iree_linalg_transform.matched} ins(%cst : f32) outs(%9 : tensor<8x16xf32>) -> tensor<8x16xf32>
        %11 = tensor.extract_slice %10[0, 0] [%1, %5] [1, 1] : tensor<8x16xf32> to tensor<?x?xf32>

The pad op extends the dynamically sized %7 to 8x16 values and fills the boundary with %cst. The extract slice removes the boundery again and returns a dynamically sized tensor that matches the size of %7.

Thanks Tobias for the info. I believe if the padding is added as an optimization for vectorization. We need to make this transformation happens before it.
If we see more cases that need a fine control, we can add a control function to this pass. Thanks!

okkwon updated this revision to Diff 419266.Mar 30 2022, 2:16 PM

Rebased onto origin/main

Thanks Tobias for the info. I believe if the padding is added as an optimization for vectorization. We need to make this transformation happens before it.
If we see more cases that need a fine control, we can add a control function to this pass. Thanks!

That makes sense!