Transpose operations on constant data were being folded during canonicalization. That folding has a compile-time cost proportional to the size of the constant. This change moves the folding into a separate pass, making it optional and giving more flexibility in how such cases are handled.
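For context, a minimal sketch of what such a folding pattern can look like; this is not the patch's actual code. The pattern name, the accessor names (getInput1, getPerms), and the perms-as-operand form are assumptions matching the TOSA dialect of the time, and transposeElements is a hypothetical helper.

```cpp
#include "mlir/Dialect/Tosa/IR/TosaOps.h"
#include "mlir/IR/Matchers.h"
#include "mlir/IR/PatternMatch.h"

using namespace mlir;

// Hypothetical helper (declaration only): materializes the permuted copy of
// the data. This is the step whose compile-time cost grows with the size of
// the constant.
DenseElementsAttr transposeElements(DenseElementsAttr input,
                                    DenseElementsAttr perms, Type resultType);

namespace {
// Fold tosa.transpose when both its data and its permutation are constants.
struct FoldConstantTranspose : public OpRewritePattern<tosa::TransposeOp> {
  using OpRewritePattern::OpRewritePattern;

  LogicalResult matchAndRewrite(tosa::TransposeOp op,
                                PatternRewriter &rewriter) const override {
    DenseElementsAttr input, perms;
    if (!matchPattern(op.getInput1(), m_Constant(&input)) ||
        !matchPattern(op.getPerms(), m_Constant(&perms)))
      return failure();

    // Replace the transpose with a single constant holding the permuted data.
    DenseElementsAttr folded = transposeElements(input, perms, op.getType());
    rewriter.replaceOpWithNewOp<tosa::ConstOp>(op, op.getType(), folded);
    return success();
  }
};
} // namespace
```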
Event Timeline
mlir/lib/Dialect/Tosa/Transforms/TosaConstantFoldPass.cpp:33
> You will want to include the canonicalization patterns as well. It is valuable to run the ConstantFoldPass separately; however, when we do finally apply it, we will want it alternating with canonicalizations to guarantee the patterns run to completion.
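A sketch of how that could look, mirroring the way the generic Canonicalizer pass gathers patterns; populateTosaFoldConstantPatterns is a hypothetical entry point standing in for the folding patterns this patch adds.

```cpp
#include "mlir/IR/MLIRContext.h"
#include "mlir/IR/PatternMatch.h"
#include "mlir/Transforms/GreedyPatternRewriteDriver.h"

using namespace mlir;

// Hypothetical entry point for the pass's own constant-folding patterns.
void populateTosaFoldConstantPatterns(MLIRContext *ctx,
                                      RewritePatternSet &patterns);

static LogicalResult foldAndCanonicalize(Operation *root) {
  MLIRContext *ctx = root->getContext();
  RewritePatternSet patterns(ctx);
  populateTosaFoldConstantPatterns(ctx, patterns);
  // The same loop the Canonicalizer pass uses: collect every registered op's
  // canonicalization patterns so the greedy driver can alternate them with
  // the folding patterns until a fixed point is reached.
  for (RegisteredOperationName op : ctx->getRegisteredOperations())
    op.getCanonicalizationPatterns(patterns, ctx);
  return applyPatternsAndFoldGreedily(root, std::move(patterns));
}
```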
mlir/lib/Dialect/Tosa/Transforms/TosaFoldConstantTranspose.cpp:92
> Add newline.
mlir/test/Dialect/Tosa/canonicalize.mlir:411
> Add newline back.
mlir/test/Dialect/Tosa/consant-op-fold.mlir:118 (On Diff #426107)
> Add newline back.
mlir/lib/Dialect/Tosa/Transforms/TosaConstantFoldPass.cpp:33
> Are you suggesting, @rsuderman, that we register the TOSA-related canonicalization patterns here as well?
Second observation: can you include this pass after the canonicalizer when lowering to linalg?
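For concreteness, the ordering being asked about might be assembled as below; createTosaConstantFoldPass and buildTosaLoweringPipeline are hypothetical names, while the canonicalizer and TOSA-to-linalg create functions are the existing entry points.

```cpp
#include "mlir/Conversion/TosaToLinalg/TosaToLinalg.h"
#include "mlir/Dialect/Func/IR/FuncOps.h"
#include "mlir/Pass/PassManager.h"
#include "mlir/Transforms/Passes.h"

// Canonicalize first, fold constants next, and only then lower to linalg.
void buildTosaLoweringPipeline(mlir::OpPassManager &pm) {
  pm.addNestedPass<mlir::func::FuncOp>(mlir::createCanonicalizerPass());
  pm.addNestedPass<mlir::func::FuncOp>(
      mlir::tosa::createTosaConstantFoldPass()); // hypothetical create function
  pm.addNestedPass<mlir::func::FuncOp>(mlir::tosa::createTosaToLinalg());
}
```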
mlir/lib/Dialect/Tosa/Transforms/TosaConstantFoldPass.cpp:33
> Yes, I believe so. I'll break down the logic: it is reasonably common to see the pattern transpose-reshape-transpose. If we only run the transpose constant propagation, the first transpose will be constant folded and the next reshape will be canonicalized, but the second transpose will never have an opportunity to constant propagate. It sounds silly, but this may actually significantly affect codegen performance.
Good point about potentially expensive operations here. I wonder: why is/was this a canonicalization pattern rather than a fold() implementation?
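For reference on the alternative being raised: a fold() hook body would compute roughly the following, sketched here as a free function (a real hook would require `let hasFolder = 1;` on the op, and this assumes the pre-FoldAdaptor signature of the time and the same hypothetical transposeElements helper as above). Since fold() runs implicitly inside canonicalization and most rewrite drivers, its size-proportional cost could not be opted out of.

```cpp
#include "mlir/Dialect/Tosa/IR/TosaOps.h"

using namespace mlir;

// Hypothetical helper (declaration only), as in the earlier sketch.
DenseElementsAttr transposeElements(DenseElementsAttr input,
                                    DenseElementsAttr perms, Type resultType);

// What a folder would do: constant operands arrive pre-materialized as
// attributes, and returning an attribute replaces the op's result in place.
static OpFoldResult foldConstantTranspose(tosa::TransposeOp op,
                                          ArrayRef<Attribute> operands) {
  auto input = operands[0].dyn_cast_or_null<DenseElementsAttr>();
  auto perms = operands[1].dyn_cast_or_null<DenseElementsAttr>();
  if (!input || !perms)
    return {};
  return transposeElements(input, perms, op.getType());
}
```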
mlir/lib/Dialect/Tosa/Transforms/TosaConstantFoldPass.cpp:33
> An expansion like https://github.com/llvm/llvm-project/blob/12e41d9264b6f84213be86aab75016fb82ebc1d1/mlir/lib/Dialect/Tosa/IR/TosaOps.cpp#L78 may help ease adding all the canonicalization patterns in here without needing to duplicate them.
mlir/lib/Dialect/Tosa/Transforms/TosaFoldConstantTranspose.cpp:30
> I'd move this closer to its first use: in general, the preference is for the smallest scope/live range possible.
Not sure, @jpienaar, why this was registered as a canonicalization originally. I can say, though, that I have noticed it being quite expensive. Moreover, from what I recall, a similar folding step takes place at the Linalg level.
This folding operation predates the Linalg folding by roughly six months. It is likely we want to maintain the TOSA folding anyway, as not every TOSA device is guaranteed to use Linalg.
LGTM (and yay for usage of LITERAL ;-))
mlir/include/mlir/Dialect/Tosa/Transforms/Passes.td:21
> Is element-wise a restriction of this pass?
mlir/include/mlir/Dialect/Tosa/Transforms/Passes.td:21
> As it stands now, it doesn't really have any restrictions; I was trying to limit its scope ahead of time. Happy to rephrase, or to leave it as is and change its scope if needed in the future.
Can you clarify the revision description? You refer to some "unbound computation", which isn't clear to me. I think the whole motivation and rationale could be better spelled out.
(Also, please remove the Change-Id: I7ec0f8b15ca6bc9aa2116488dcf6c684c9826ddd part, which seems to be a leftover from some other system?)
Hey @mehdi_amini, thanks for your comments. I simplified the commit message a bit and removed the Change-Id part.
Generally LGTM, although we may want to further constrain this in a follow-up. In general, I am pro having optimization passes and having them be truly optional. I would be tempted to name this "TensorDataConstantOptimizationPass".
mlir/lib/Conversion/TosaToLinalg/TosaToLinalgPass.cpp:79
> I am fairly certain that we will want to remove this -- optimizations like this are orthogonal to actually lowering to linalg, and it is fairly important that optimization passes be kept separate from lowerings. This qualifies to me as an optimization because it has a cost model (i.e., only apply if the constant has a single use) which we will likely want to vary.
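The cost model referenced here, sketched as a hypothetical predicate: folding materializes a second copy of the data, so it only clearly pays off when the transpose is the constant's sole user.

```cpp
#include "mlir/Dialect/Tosa/IR/TosaOps.h"

// Hypothetical profitability check: if the source constant has other users,
// folding duplicates the tensor data rather than replacing it.
static bool isProfitableToFold(mlir::tosa::TransposeOp op) {
  return op.getInput1().hasOneUse();
}
```

Varying that predicate, for example with a size threshold, is the kind of adjustment the comment anticipates.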
Will make the change and rename per your suggestion, @stellaraccident.
mlir/lib/Conversion/TosaToLinalg/TosaToLinalgPass.cpp:79
> Agreed. In my original patch this wasn't registered in the TosaToLinalg conversion pipeline; it was added back after a suggestion from @rsuderman. Happy to remove it.
mlir/lib/Conversion/TosaToLinalg/TosaToLinalgPass.cpp:79
> It looks like this pipeline has some potential phase-ordering concerns, so I'll defer to @rsuderman on the next step. If this is a temporary step to preserve current behavior, I am fine keeping it (to be removed in a future cleanup), but if that is the case, let's add a TODO.
mlir/lib/Conversion/TosaToLinalg/TosaToLinalgPass.cpp:79
> We should be good having it inlined here. If we see a regression related to convolutional models, we may have to adjust the constant folding. Overall it should still be fine with the linalg constant folding, but this should be good enough for now.