This adds a transformation to tile reduction operations to partial
reduction using scf.foreachthread. This uses
PartialReductionOpInterface to create a merge operation of the partial
tiles.
Details
Diff Detail
- Repository
- rG LLVM Github Monorepo
Event Timeline
Generally looks good, but have we tried refactoring enough?
mlir/include/mlir/Dialect/Linalg/TransformOps/LinalgTransformOps.td | ||
---|---|---|
726 | number number ? | |
727 | nit: period + newline to better align the sentences. | |
mlir/include/mlir/Dialect/Linalg/Transforms/Transforms.h | ||
456 | Can we make this singular? There should only be one such (multi-)loop. | |
mlir/lib/Dialect/Linalg/TransformOps/LinalgTransformOps.cpp | ||
1189 | Very important, on silenceable failure and success paths please always set all results or it will be hard to debug in Suppress error mode. E.g. a bit higher in this file is an example: // If nothing to fuse, propagate success. if (producerOps.empty()) { results.set(getFusedOp().cast<OpResult>(), SmallVector<mlir::Operation *>{}); return DiagnosedSilenceableFailure::success(); } | |
1190 | Please see https://reviews.llvm.org/D136072 (and save it somewhere), we want to use emitSilenceableFailure + twine as much as possible everywhere from now on. | |
mlir/lib/Dialect/Linalg/Transforms/Tiling.cpp | ||
414 | This seems very long given that we already have mlir::scf::tileReductionUsingScf (in a separate file) and tileToForeachThreadOpImpl (which this impl does not reuse). Have you tried reusing enough ? | |
435 | don't reimplement linalgOp. getReductionDims (ish) | |
475 | can this be lifted and become a notifyMatchFailure rather than a hard error that leaves the IR in a modified (an invalid?) state? |
mlir/lib/Dialect/Linalg/TransformOps/LinalgTransformOps.cpp | ||
---|---|---|
1189 | Ok thanks for pointing that out, I believe it should match the right flow now. | |
mlir/lib/Dialect/Linalg/Transforms/Tiling.cpp | ||
414 | Yes I couldn't find a way to isolate more helpers besides calculateTileOffsetsAndSizes as the rest of the processing is slightly different than what is in tileToForeachThreadOpImpl |
Skimmed through. +1 for using getTiledImplementation! I can review in more detail if needed....
number number ?