Add a pass that outlines clusters of operations (conv2d or a similar op, plus adjacent elementwise and constant ops) into kernel functions, so that they can be optimised separately later.
Repository: rG LLVM Github Monorepo
Event Timeline
Decisions on what can be fused are often very hardware-specific. I do like that the partitioning is parameterized so that, if I'm understanding properly, any set of ops can be defined as anchors, along with the leading/trailing ops to be captured. Is a new function the right destination for these? Have you looked at using the ml_program dialect to capture this as a region? I would imagine the overall structure wouldn't need to change significantly. It's at least an option worth considering.
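To make the anchor plus leading/trailing parameterization concrete, here is a minimal toy sketch in Python. It is not the code in this patch and does not use any MLIR API: the `Op` class, the `partition` function, and its parameter names are all hypothetical, and a program is modelled as a flat list of ops in definition order. Each anchor first absorbs trailing ops that consume values from its chain, then leading ops that feed any op already in the cluster.

```python
from dataclasses import dataclass, field


@dataclass
class Op:
    """Toy stand-in for an IR operation (hypothetical, not an MLIR class)."""
    name: str                       # e.g. "tosa.conv2d"
    result: str                     # name of the value this op produces
    operands: list = field(default_factory=list)  # names of consumed values


def partition(ops, anchors, leading, trailing):
    """Return clusters as sorted lists of indices into `ops`.

    `anchors`, `leading`, and `trailing` are sets of op names. An op joins
    at most one cluster; this toy model does not clone shared constants.
    """
    producer = {op.result: i for i, op in enumerate(ops)}
    claimed = set()
    clusters = []
    for i, op in enumerate(ops):
        if op.name not in anchors or i in claimed:
            continue
        cluster = {i}

        # Forward: absorb trailing ops that consume the anchor's chain.
        chain_values = {op.result}
        for k in range(i + 1, len(ops)):
            if k in claimed or ops[k].name not in trailing:
                continue
            if any(v in chain_values for v in ops[k].operands):
                cluster.add(k)
                chain_values.add(ops[k].result)

        # Backward: absorb leading ops feeding any op already in the cluster.
        stack = list(cluster)
        while stack:
            j = stack.pop()
            for v in ops[j].operands:
                p = producer.get(v)
                if p is None or p in claimed or p in cluster:
                    continue
                if ops[p].name in leading:
                    cluster.add(p)
                    stack.append(p)

        claimed |= cluster
        clusters.append(sorted(cluster))
    return clusters


example_ops = [
    Op("tosa.const", "w"),
    Op("tosa.conv2d", "c", ["x", "w", "b"]),
    Op("tosa.const", "k"),
    Op("tosa.add", "a", ["c", "k"]),
    Op("tosa.clamp", "r", ["a"]),
    Op("tosa.reshape", "s", ["r"]),
]
clusters = partition(
    example_ops,
    anchors={"tosa.conv2d"},
    leading={"tosa.const"},
    trailing={"tosa.add", "tosa.clamp"},
)
print(clusters)  # [[0, 1, 2, 3, 4]]
```

In this sketch the `tosa.reshape` stays outside the cluster because it is in none of the three sets. Whether the resulting cluster is then outlined into a function or wrapped in a region is a separate choice that the grouping step does not depend on.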
Inline comment on mlir/test/Dialect/Tosa/tosa-partition-run.mlir, line 1:
These seem more appropriate for test/Integration/Dialect/Tosa (no such directory exists currently, but it would be similar to other tests in test/Integration).
We've recast the partitioner as a generic utility with a Tosa pass (and, in the future, other passes) as a client. That will come in a future revision. I'll keep the suggestions about the dialect/region and the test location in mind.