tensor.pad can be represented as an initialization of the result with the padding value, followed by a strided insert of the input into the output. To honor the invariants enforced by bufferization, the pad operation's bufferization creates both of these steps at the same time.
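As a rough sketch of the scheme described above (the op names are real MLIR ops, but the exact IR this patch produces may differ; all value names and shapes here are illustrative):

```mlir
// Input: pad a dynamically sized tensor with a constant value.
%padded = tensor.pad %src low[1, 2] high[2, 3] {
^bb0(%i: index, %j: index):
  tensor.yield %pad_value : f32
} : tensor<?x?xf32> to tensor<?x?xf32>

// Conceptual bufferized form:
// 1) allocate the result buffer and initialize every element with the
//    padding value,
// 2) insert the source into the interior via a strided subview.
%alloc = memref.alloc(%d0, %d1) : memref<?x?xf32>
scf.parallel (%i, %j) = (%c0, %c0) to (%d0, %d1) step (%c1, %c1) {
  memref.store %pad_value, %alloc[%i, %j] : memref<?x?xf32>
}
%interior = memref.subview %alloc[1, 2] [%s0, %s1] [1, 1]
    : memref<?x?xf32> to memref<?x?xf32, strided<[?, 1], offset: ?>>
memref.copy %src_buf, %interior
    : memref<?x?xf32> to memref<?x?xf32, strided<[?, 1], offset: ?>>
```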
Diff Detail
- Repository: rG LLVM Github Monorepo

Event Timeline
mlir/lib/Dialect/Tensor/Transforms/BufferizableOpInterfaceImpl.cpp
- Line 746: nit: loc?
- Lines 754–760: What about making this function a part of extraClassDeclaration for PadOp?
- Line 760: Cache the results of getMixedLowPad and getMixedHighPad, since they construct a new vector?
- Line 771: super-nit: "maybeResult" and "result" are a bit confusing here. What about just calling them allocOr and alloc?
- Line 783: upperBounds.reserve(rank)?
- Line 785: nit: for (int i = 0, nextDynamicIndex = 0; i < rank; i++)?
- Line 873: nit: remove empty line?
mlir/lib/Dialect/Tensor/Transforms/BufferizableOpInterfaceImpl.cpp
- Lines 769–770: See comment below. It would be cleaner to query the loop bounds with ReifyRankedShapedTypeOpInterface. Can be done in a separate commit, or add a TODO.
mlir/lib/Dialect/Tensor/Transforms/Bufferize.cpp
- Line 43: Why is this needed?
mlir/test/Dialect/Tensor/bufferize.mlir
- Lines 379–382: Try this with a test case that has a dynamic result dim size. I think the bufferization will fail. PadOp should implement ReifyRankedShapedTypeOpInterface, which it currently does not. Once it does, the dynamic bufferization test case should pass. (Can be done in a separate commit.)
mlir/test/Dialect/Tensor/bufferize.mlir
- Lines 395–399: As a general comment, I'm not sure why scf.parallel ops are being generated here instead of affine.parallel + affine.store — these are trivially affine. And then if you do scf -> affine on this, you'll get this scf output exactly. So you have a strictly better path lowering this through affine. I notice there are other lowerings that have chosen this, and that's just because the counter-pattern was missed and propagated during review. CC: @ftynse @mehdi_amini
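For reference, the initialization loop in affine form suggested here might look like the following (a sketch only; bounds and value names are illustrative and not the patch's actual output):

```mlir
// Initialize the padded buffer with affine.parallel + affine.store
// instead of scf.parallel; lowering affine -> scf then recovers the
// scf.parallel form.
affine.parallel (%i, %j) = (0, 0) to (symbol(%d0), symbol(%d1)) {
  affine.store %pad_value, %alloc[%i, %j] : memref<?x?xf32>
}
```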
mlir/test/Dialect/Tensor/bufferize.mlir
- Lines 395–399: Correction: "scf -> affine" should be "affine -> scf" above.
mlir/test/Dialect/Tensor/bufferize.mlir
- Lines 395–399: E.g., tensor.generate also lowers to scf.parallel. I think there's no particular reason why it was implemented like that. When I touched that bufferization pattern the last time, I just did not think of lowering to affine, then to SCF.
mlir/lib/Dialect/Tensor/Transforms/BufferizableOpInterfaceImpl.cpp
- Lines 769–770: Just noticed that the op already implements the interface via an external model: mlir::tensor::registerInferTypeOpInterfaceExternalModels.
mlir/test/Dialect/Tensor/bufferize.mlir
- Lines 395–399: tensor -> affine -> scf would be the natural lowering and de-abstraction here. Lowering directly from tensor -> scf, whenever tensor -> affine is possible, would just be the wrong direction and strictly worse in all possible ways. We shouldn't propagate this further, and we should fix the rest; those cases were just missed during review.
To follow on Uday's comment, this should not generate loops at all.
In linalg, you'd generate linalg.fill + subview + linalg.copy and later re-use the lowering to any of the loop flavors you want.
Seems like you are missing a memref.fill or memref.memset abstraction.
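A loop-free version along these lines might look like the following sketch (assumes memref operands after bufferization; memref.copy is used here as a stand-in for the linalg.copy mentioned above, and all names and shapes are illustrative):

```mlir
// No loops at all: fill the destination, then copy the source into a
// strided interior view. Loop generation is deferred to the existing
// linalg lowerings.
linalg.fill ins(%pad_value : f32) outs(%alloc : memref<?x?xf32>)
%interior = memref.subview %alloc[1, 2] [%s0, %s1] [1, 1]
    : memref<?x?xf32> to memref<?x?xf32, strided<[?, 1], offset: ?>>
memref.copy %src_buf, %interior
    : memref<?x?xf32> to memref<?x?xf32, strided<[?, 1], offset: ?>>
```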
Sorry for delaying this for so long. I am proposing, instead, a TensorToLinalgPass here: https://reviews.llvm.org/D125384
I believe that bufferization should try to stay simple: tensor ops converted to memref ops in the case of accesses, and other operations converting their types from tensors to memrefs.
For lowerings like tensor.pad, which involve other control flow as discussed here, I believe this should not be considered part of bufferization; it should instead be a more principled lowering at another point in time. This separates the concerns of tensor operations that are closely tied to the tensor type from operations that are there for convenience but would be just as relevant somewhere like linalg.