For most Linalg operations, the shape of the output can be computed
from the shapes of the inputs. This change enhances the
InferShapedTypeOpInterface to allow moving all Linalg structured
operations, as well as some other Linalg ops, to this
interface. Adding a canonicalization for dim ops that take the
dim of the result of an op implementing this interface allows
removing many patterns across these ops that implemented the
same functionality.
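As an illustrative sketch of that canonicalization (the op, shapes, and value names below are hypothetical, not taken from the patch):

```mlir
// Before: a dim op queries the shape of a linalg op's result.
%0 = linalg.generic ... ins(%arg0 : tensor<?x?xf32>) ... -> tensor<?x?xf32>
%c0 = constant 0 : index
%d0 = dim %0, %c0 : tensor<?x?xf32>

// After: since the op implements the interface, the result shape is
// resolved from the input shape, and the dim no longer depends on %0.
%c0 = constant 0 : index
%d0 = dim %arg0, %c0 : tensor<?x?xf32>
```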
Diff Detail
- Repository: rG LLVM Github Monorepo
Event Timeline
Nice! So then if I plug the shape maps through named ops, I can also implement the interface, right?
mlir/include/mlir/Interfaces/InferTypeOpInterface.td
| Line | Comment |
|---|---|
| 165 | Since this is generic, do you also want to early exit if there are no results? |
@nicolasvasilache and @stellaraccident. I guess I kind of cheated here. Right now this assumes that all the Linalg structured ops use the default mapping of shapes -> loops and loops -> shapes, so you can resolve the output shapes from the input shapes using the affine maps. The correct way to do this would be to have a separate interface method on LinalgOp that provides a default implementation. Each operation for which the default affine-map-based shape computation is not valid would then have to override this default implementation. I think it's a change that is possible to do right now; will make that change. Then the Python front end could do nothing and named ops would pick up the standard method, or they could override the default interface on LinalgOp to provide their own shape transfer function.
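The shapes -> loops -> shapes resolution described above can be sketched in Python. This is a toy model, not MLIR's implementation; the function name and encoding of the indexing maps are hypothetical, and only maps whose results are plain loop dimensions (matmul-like maps) are handled:

```python
def infer_output_shape(input_maps, input_shapes, output_map, num_loops):
    """input_maps: per input, a list of loop-dim indices, one per input dim.
    input_shapes: per input, a list of dim sizes.
    output_map: a list of loop-dim indices, one per output dim.
    Returns the output shape implied by shapes -> loops -> shapes."""
    loop_ranges = [None] * num_loops
    # shapes -> loops: each input dimension pins the range of one loop.
    for dims, shape in zip(input_maps, input_shapes):
        for loop_dim, size in zip(dims, shape):
            if loop_ranges[loop_dim] is None:
                loop_ranges[loop_dim] = size
    if any(r is None for r in loop_ranges):
        raise ValueError("some loop ranges are not determined by the inputs")
    # loops -> shapes: read the output shape off the resolved loop ranges.
    return [loop_ranges[d] for d in output_map]

# Matmul: A indexed by (d0, d2), B by (d2, d1), C by (d0, d1) over 3 loops.
shape = infer_output_shape(
    input_maps=[[0, 2], [2, 1]],
    input_shapes=[[4, 8], [8, 16]],
    output_map=[0, 1],
    num_loops=3,
)
print(shape)  # [4, 16]
```

An op whose shape computation does not follow this default (e.g. a reshape-like op) is exactly the case that would override the interface method instead.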
Let me know if you see any gotchas here.
mlir/include/mlir/Interfaces/InferTypeOpInterface.td
| Line | Comment |
|---|---|
| 109–157 | Does this result correspond to an operand of the op as defined for the op, or as part of the "flattened" form? |
| 115 | `dim` isn't documented. Is this saying that if you have a `tensor<10x20xf32>` you can specify 0, 1 to get constant 10? It would seem you get exactly this by reify & get_element_at & canonicalizing. How many of these custom helpers are generated by creating all these methods vs. composition? What is the impact on code size? These seem to favor cases where the dimension computation is very separate and therefore cheap; I think that is a special case. |
| 118 | By your definition, the below is only valid if there is a single result. |
| 157 | Why this restriction? This is returning an array of results precisely because each of them is for a different value. |
| 168 | Again this is making the assumption that computing each dim individually is cheaper than computing the whole, so just compute all and make the base case the per-index-and-dim one. |
Thanks for doing this Mahesh! This will really help unblock me.
mlir/lib/Dialect/StandardOps/IR/Ops.cpp
| Line | Comment |
|---|---|
| 1570–1571 | This should probably be `dyn_cast_or_null`, for the case where `dim` is used on a block/function argument (which have null owners). |
mlir/include/mlir/Interfaces/InferTypeOpInterface.td
| Line | Comment |
|---|---|
| 122 | Remove the special case for one result. It's not hard to implement the more general case (just add `assert(resultIndex == 0)` at the top). |
After an offline conversation with Jacques, I realize I had misread the existing interface. I haven't changed anything in this patch, but a follow-up patch (https://reviews.llvm.org/D97887) goes back to the previous interface, and each op that uses it needs to use reifyReturnTypeShapes to compute the shapes of all its results. PTAL and let me know which works better. I prefer the one in https://reviews.llvm.org/D97887 because it is less ambiguous.
mlir/include/mlir/Interfaces/InferTypeOpInterface.td
| Line | Comment |
|---|---|
| 109–157 | This was meant to be the result index if the op returns multiple results. |
| 115 | See the follow-up patch, which addresses this concern to some extent. W.r.t. code size, AFAIK the methods are just default implementations on the interface (and on every op that needs to override them), so it shouldn't impact code size, though there is some vtable jumping on invoking this method. |
| 118 | Yeah, the documentation needs to be changed. Will take a pass at it based on comments with the follow-up patch (this method might not even be needed). |
| 122 | It's useful to have the special case for resultIndex == 0. See the canonicalization on DimOp: I don't need to call different interface methods depending on the number of results of the operation. Overall, though, I found this interface maybe a bit clunky. The follow-up patch goes back to the old interface, where each op is expected to return the shapes of all its results. That works well in practice, with canonicalizations removing all unnecessary code. |
| 157 | I misread the interface, as I mentioned to you. See the follow-up patch. |
| 165 | Sure, I will add that, but please do look at the follow-up patch. This issue will not exist in that interface. |
| 168 | See the follow-up patch, which avoids this. |
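The D97887-style interface discussed in these replies, where one call reifies the shapes of all results and a dim canonicalization simply indexes into them, can be sketched as a toy Python model (names and the dict-based op encoding are hypothetical, not the actual MLIR API):

```python
def reify_return_type_shapes(op):
    """Return one shape (a list of dim sizes) per result of `op`.
    This hypothetical op simply forwards its input shapes to its results."""
    return [list(shape) for shape in op["input_shapes"]]

def resolve_dim(op, result_index, dim_index):
    # A single interface call covers every result, so the caller does not
    # need separate per-result or per-dimension entry points.
    all_shapes = reify_return_type_shapes(op)
    return all_shapes[result_index][dim_index]

op = {"input_shapes": [(4, 8), (8, 16)]}
print(resolve_dim(op, result_index=1, dim_index=0))  # 8
```

Canonicalization then removes whatever reified shape computations are not actually queried, which is why returning all shapes at once costs nothing in practice.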