This is an archive of the discontinued LLVM Phabricator instance.

Refactor GenericPadTensorOpVectorizationPattern
ClosedPublic

Authored by cathyzhyi on Jul 1 2021, 9:35 AM.

Details

Summary

Refactor the original code to rewrite a PadTensorOp into a
sequence of InitTensorOp, FillOp and InsertSliceOp without
vectorization by default. GenericPadTensorOpVectorizationPattern
provides a customized OptimizeCopyFn to vectorize the
copying step.
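
For orientation, here is a minimal C++ sketch of the shape such a hook-based pattern can take. OptimizeCopyFn is the hook named above; the struct name, constructor, and elided rewrite body are illustrative assumptions, not the exact upstream API.

#include <functional>
#include <utility>

#include "mlir/Dialect/Linalg/IR/LinalgOps.h"
#include "mlir/IR/PatternMatch.h"

namespace sketch {
using namespace mlir;
using namespace mlir::linalg;

// Hook that may replace the "copy the source into the padded tensor" step
// with a vectorized equivalent; returning failure() keeps the default
// InsertSliceOp-based copy.
using OptimizeCopyFn =
    std::function<LogicalResult(PatternRewriter &, PadTensorOp, Value)>;

// Base pattern: PadTensorOp -> InitTensorOp + FillOp + copy step.
struct GeneralizePadTensorOpPatternSketch : OpRewritePattern<PadTensorOp> {
  GeneralizePadTensorOpPatternSketch(MLIRContext *ctx,
                                     OptimizeCopyFn optimizeCopyFn = nullptr)
      : OpRewritePattern<PadTensorOp>(ctx),
        optimizeCopyFn(std::move(optimizeCopyFn)) {}

  LogicalResult matchAndRewrite(PadTensorOp padOp,
                                PatternRewriter &rewriter) const override {
    // Elided: create InitTensorOp + FillOp, then either let optimizeCopyFn
    // emit a vectorized copy or fall back to tensor.insert_slice.
    return failure();
  }

  OptimizeCopyFn optimizeCopyFn;
};
} // namespace sketch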

Diff Detail

Event Timeline

cathyzhyi created this revision. Jul 1 2021, 9:35 AM
cathyzhyi requested review of this revision. Jul 1 2021, 9:35 AM
cathyzhyi updated this revision to Diff 355917. Jul 1 2021, 9:39 AM

delete redundant comments

cathyzhyi updated this revision to Diff 355925. Jul 1 2021, 9:53 AM

remove redundant type check

silvas requested changes to this revision. Jul 1 2021, 10:14 AM

Can you add an integration test analogous to mlir/test/Integration/Dialect/Linalg/CPU/test-subtensor-insert.mlir ?

mlir/lib/Dialect/Linalg/Transforms/Bufferize.cpp
223 ↗(On Diff #355925)

nit: Prefer PadTensorOp::Adaptor

237 ↗(On Diff #355925)

nit: use llvm::hasSingleElement
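
For reference, a generic usage sketch of the helper (not the patch's code):

#include "llvm/ADT/STLExtras.h"
#include "llvm/ADT/SmallVector.h"

// llvm::hasSingleElement(range) reads more directly than comparing a size or
// std::distance against 1, and also works for ranges without O(1) size().
static bool hasExactlyOneValue(const llvm::SmallVector<int, 4> &values) {
  return llvm::hasSingleElement(values);
}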

256 ↗(On Diff #355925)

nit: End the comment with a period. Same for the other comments.

https://llvm.org/docs/CodingStandards.html#commenting

mlir/test/Dialect/Linalg/bufferize.mlir
270 ↗(On Diff #355925)

Can you give meaningful names to the CHECK values?

This revision now requires changes to proceed. Jul 1 2021, 10:14 AM

Note that I asked for a similar rewrite in https://reviews.llvm.org/D102804 but it was eventually not done.
I am wondering whether the pattern you add here should more generally be the rewrite on tensors I mentioned and then just let existing bufferization kick in?

silvas added a comment. Jul 1 2021, 1:50 PM

Note that I asked for a similar rewrite in https://reviews.llvm.org/D102804 but it was eventually not done.
I am wondering whether the pattern you add here should more generally be the rewrite on tensors I mentioned and then just let existing bufferization kick in?

That makes sense to me!

cathyzhyi updated this revision to Diff 356010. Jul 1 2021, 2:00 PM
cathyzhyi marked 3 inline comments as done.

address comments

@cathyzhyi I copyhacked your code into a tensor rewrite here: https://reviews.llvm.org/D105317

It seems fine modulo missing canonicalizations on memref::DimOp + tensor (which also interferes with the tensor::DimOp work that @springerm is looking at).

Feel free to take whatever makes sense to you and refactor / land.
Depending on where you go with this I'll adapt.

Will pick it up again tomorrow.

Thanks for pushing this!

Note that I asked for a similar rewrite in https://reviews.llvm.org/D102804 but it was eventually not done.
I am wondering whether the pattern you add here should more generally be the rewrite on tensors I mentioned and then just let existing bufferization kick in?

That makes sense to me!

+1 thanks for the suggestion!

cathyzhyi updated this revision to Diff 356084. Jul 1 2021, 7:13 PM
cathyzhyi retitled this revision from Add bufferization for linalg.pad_tensor op to Rewrite PadTensor before bufferizing.
cathyzhyi edited the summary of this revision. (Show Details)

use rewrite pattern instead

cathyzhyi added a comment (edited). Jul 1 2021, 7:18 PM

@nicolasvasilache It seems that with the rewrite-pattern version the integration test passes, but there are some failures in the unit test. Will need to take a closer look tomorrow. Here is the unit test:

func @pad_tensor(%arg0: tensor<4x?x2x?xf32>, %arg1: index) -> tensor<4x?x?x?xf32> {
  %cst = constant 0.0 : f32
  %out = linalg.pad_tensor %arg0 low[0, 0, %arg1, 0] high[0, 0, 0, %arg1]  {
  ^bb0(%gen_arg1: index, %gen_arg2: index, %gen_arg3: index, %gen_arg4: index):  // no predecessors
    linalg.yield %cst : f32
  } : tensor<4x?x2x?xf32> to tensor<4x?x?x?xf32>
  return %out : tensor<4x?x?x?xf32>
}
springerm requested changes to this revision. Jul 1 2021, 7:30 PM

The functionality of this pattern is already provided by GenericPadTensorOpVectorizationPattern, which also translates a PadTensorOp into InitTensorOp + FillOp + InsertSliceOp. However, GenericPadTensorOpVectorizationPattern does a bit more: it tries to generate vectorized alternatives to FillOp and InsertSliceOp, and only generates FillOp and InsertSliceOp when there is not enough static type information.

See https://reviews.llvm.org/D103679 for details. (The pattern was extended in subsequent commits, so check on GitHub for the most recent version.)

Can you run the vectorization pass, or is vectorization not desired in your use case? If you just want InitTensorOp + FillOp + InsertSliceOp and no vectorization, I think there should be a way to avoid duplicating that functionality.

This revision now requires changes to proceed. Jul 1 2021, 7:30 PM
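
To make the fallback criterion above concrete, a small sketch; it is assumed and simplified, and the real pattern inspects more than just the two shapes:

#include "mlir/IR/BuiltinTypes.h"

// Vectorized fill/copy needs fully static shapes; with dynamic dimensions the
// pattern falls back to FillOp + InsertSliceOp.
static bool enoughStaticInfoToVectorize(mlir::RankedTensorType resultType,
                                        mlir::RankedTensorType sourceType) {
  return resultType.hasStaticShape() && sourceType.hasStaticShape();
}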

@nicolasvasilache It seems that with the rewrite-pattern version the integration test passes, but there are some failures in the unit test. Will need to take a closer look tomorrow. Here is the unit test:

func @pad_tensor(%arg0: tensor<4x?x2x?xf32>, %arg1: index) -> tensor<4x?x?x?xf32> {
  %cst = constant 0.0 : f32
  %out = linalg.pad_tensor %arg0 low[0, 0, %arg1, 0] high[0, 0, 0, %arg1]  {
  ^bb0(%gen_arg1: index, %gen_arg2: index, %gen_arg3: index, %gen_arg4: index):  // no predecessors
    linalg.yield %cst : f32
  } : tensor<4x?x2x?xf32> to tensor<4x?x?x?xf32>
  return %out : tensor<4x?x?x?xf32>
}

It seems the version @springerm has for vectorization already fits the bill and would just need to be split out from the vectorization part.
@springerm, indeed, if we get to bufferization and still have this form, we should bufferize without relying on vectorization.

After looking at this in more detail, the easiest way would be to add a boolean (template) parameter to GenericPadTensorOpVectorizationPattern, which enables or disables vectorization (and maybe rename the pattern). Then register the pattern in applyEnablingTransformations. However, then we would have a dependency of ComprehensiveBufferize.cpp on Vectorization.cpp. Not sure if that is a good idea. I don't see a better way of factoring out common functionality.

Alternatively, just duplicate the pattern (without vectorization) as this revision does at the moment. With the above code suggestion, we would be duplicating around 50 lines of code. Maybe not too bad.

@nicolasvasilache What do you think?
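
For concreteness, the boolean (template) parameter alternative might look roughly like this; the names and the elided bodies are assumptions, not upstream code:

#include "mlir/Dialect/Linalg/IR/LinalgOps.h"
#include "mlir/IR/PatternMatch.h"

// One pattern body; vectorization is toggled at compile time.
template <bool EnableVectorization>
struct PadTensorOpLoweringSketch
    : public mlir::OpRewritePattern<mlir::linalg::PadTensorOp> {
  using mlir::OpRewritePattern<mlir::linalg::PadTensorOp>::OpRewritePattern;

  mlir::LogicalResult
  matchAndRewrite(mlir::linalg::PadTensorOp padOp,
                  mlir::PatternRewriter &rewriter) const override {
    // Elided: create InitTensorOp + FillOp.
    if (EnableVectorization) {
      // Elided: try vectorized fill/copy; fall through on failure.
    }
    // Elided: default tensor.insert_slice copy.
    return mlir::failure();
  }
};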

mlir/lib/Dialect/Linalg/Transforms/ComprehensiveBufferize.cpp
2490–2507 ↗(On Diff #356084)

Can be replaced with:

Value paddingValue = op.getConstantPaddingValue();
if (!paddingValue)
  return failure();

After looking at this in more detail, the easiest way would be to add a boolean (template) parameter to GenericPadTensorOpVectorizationPattern, which enables or disables vectorization (and maybe rename the pattern). Then register the pattern in applyEnablingTransformations. However, then we would have a dependency of ComprehensiveBufferize.cpp on Vectorization.cpp. Not sure if that is a good idea. I don't see a better way of factoring out common functionality.

Alternatively, just duplicate the pattern (without vectorization) as this revision does at the moment. With the above code suggestion, we would be duplicating around 50 lines of code. Maybe not too bad.

@nicolasvasilache What do you think?

Indeed, so what I have in my local branch is quite close:

  1. the FillOp is not a "tryVectorize" but just a "createFillOrGenerateOp"; it is unconditionally there
  2. I just deleted the copy vectorization.

It seems this could be refactored and moved to Linalg/Transforms/Transforms.h and take a lambda for the copy.
By default it would take nothing (to just lower to other tensor ops).
The vectorization pattern would just reconfigure the pattern to add its own copy vectorization mechanism.
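
A sketch of what that reconfiguration could look like at registration time, reusing the hook-based pattern sketched under the summary above (the function and pattern names are assumptions):

#include "mlir/Dialect/Linalg/IR/LinalgOps.h"
#include "mlir/IR/PatternMatch.h"

// The default configuration lowers to InitTensorOp + FillOp + InsertSliceOp
// only; the vectorization path just swaps in its own copy hook.
void populatePadTensorLoweringSketch(mlir::RewritePatternSet &patterns,
                                     bool vectorizeCopy) {
  mlir::MLIRContext *ctx = patterns.getContext();
  if (!vectorizeCopy) {
    patterns.add<sketch::GeneralizePadTensorOpPatternSketch>(ctx);
    return;
  }
  patterns.add<sketch::GeneralizePadTensorOpPatternSketch>(
      ctx, [](mlir::PatternRewriter &rewriter, mlir::linalg::PadTensorOp padOp,
              mlir::Value dest) -> mlir::LogicalResult {
        // Elided: emit vector.transfer_read / vector.transfer_write here;
        // returning failure() keeps the tensor.insert_slice fallback.
        return mlir::failure();
      });
}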

cathyzhyi updated this revision to Diff 356230. Jul 2 2021, 11:18 AM
cathyzhyi retitled this revision from Rewrite PadTensor before bufferizing to Refactor GenericPadTensorOpVectorizationPattern.
cathyzhyi edited the summary of this revision. (Show Details)

refactor and reuse existing code as suggested.

cathyzhyi updated this revision to Diff 356232. Jul 2 2021, 11:28 AM

update comments that are no longer relevant.

@silvas @nicolasvasilache @springerm Any suggestion on which pass to put this pattern in? There is a LinalgGeneralizationPass, which currently only applies patterns that lower to linalg.generic.

@silvas @nicolasvasilache @springerm Any suggestion on which pass to put this pattern in? There is a LinalgGeneralizationPass, which currently only applies patterns that lower to linalg.generic.

The pattern will be used by whichever passes need it, such as ComprehensiveBufferize and Bufferize. (You should be able to just add this pattern into Bufferize.cpp in your original patch and let dialect conversion just work.)

cathyzhyi updated this revision to Diff 356246. Jul 2 2021, 12:26 PM

add the pattern into linalg bufferization pass

Oh, sorry, I thought you meant which pass would use it. It would be nice if it were in a generic helper file. You can probably put it next to PadTensorOpTransformationPattern for consistency.

cathyzhyi updated this revision to Diff 356269 (edited). Jul 2 2021, 1:39 PM

move the new pattern's source code next to PadTensorOpTransformationPattern for
consistency

silvas accepted this revision. Jul 2 2021, 1:41 PM

LGTM. Wait for @springerm / @nicolasvasilache approval too.

nicolasvasilache accepted this revision. Jul 2 2021, 2:00 PM
springerm accepted this revision. Jul 2 2021, 6:23 PM

Nice refactoring!

mlir/lib/Dialect/Linalg/Transforms/Vectorization.cpp
689–691

Can be deleted.

This revision is now accepted and ready to land. Jul 2 2021, 6:23 PM
cathyzhyi updated this revision to Diff 356314. Jul 2 2021, 7:38 PM

delete redundant comments as suggested

cathyzhyi marked an inline comment as done. Jul 2 2021, 7:42 PM

This has reached consensus. I suspect @cathyzhyi does not yet have write privileges on GitHub, so I am committing on her behalf.

This revision was automatically updated to reflect the committed changes.