The patch introduces a canonicalization pattern for parallel loops. The pattern removes single-iteration loop dimensions if the loop bounds and steps are constants.
Details
Diff Detail
Event Timeline
Thank you for splitting this out! Just some nits but otherwise this looks great! Thanks for adding this.
mlir/lib/Dialect/SCF/SCF.cpp | ||
---|---|---|
728 | nit: You can also use op.getInductionVars()which makes the intent clearer. | |
mlir/test/Dialect/SCF/canonicalize.mlir | ||
12 | It would be great to add the case of an empty iteration space. Either with another copy that does not get rewritten or by introducing one in this example. |
mlir/test/Dialect/SCF/canonicalize.mlir | ||
---|---|---|
12 | Does it make sense to have a canonicalization pattern that remove loops with an empty iteration space entirely. I was circumventing this bcs I was not sure if I can simply replace all uses of the loop result values by the init values. And to be honest, I also do not believe it is a very relevant "optimization". |
mlir/lib/Dialect/SCF/SCF.cpp | ||
---|---|---|
755 | It isn't clear to me that you need a mapping here, instead of mapping.map(std::get<3>(bounds), std::get<0>(bounds)); couldn't you just do std::get<3>(bounds).replaceAllUsesWith(std::get<0>(bounds)) and inline the region? Alternatively, we should extend the inlineRegion methods with one that fits the needs here. |
mlir/lib/Dialect/SCF/SCF.cpp | ||
---|---|---|
755 | My understanding (from an earlier review) is that it is not a good idea to use replaceAllUsesWith with the PatternRewriter since it cannot track these changes (which is needed to rollback?). But I may be wrong there and things may have changed since this discussion. Let me know if there is a misunderstanding on my side. Having an inline signature with a mapping sounds like a useful extension to me! |
mlir/lib/Dialect/SCF/SCF.cpp | ||
---|---|---|
755 | I learned canonicalization patterns are never reverted on success. I thus assume I can use replaceAllUsesWith but only after line 738 where the pattern may return failure()? Unfortunately I also need to erase all remapped arguments. The following code works for me: for (auto iv : llvm::enumerate(llvm::reverse(op.getInductionVars()))) { if (auto lowerBound = mapping.lookupOrNull(iv.value())) { iv.value().replaceAllUsesWith(lowerBound); op.getBody()->eraseArgument(op.getNumLoops() - iv.index() - 1); } } rewriter.inlineRegionBefore(op.region(), newOp.region(), newOp.region().begin()); If I do not erase the remapped arguments I get an invalid loop with the original number of induction variables / block arguments. Let me know if there is a nicer way to do this and if I should update the patch. |
mlir/lib/Dialect/SCF/SCF.cpp | ||
---|---|---|
755 | I'd say they are _currently_ never reverted on success if you run the canonicalization pass. If somebody decides to put this pattern into the legalization infra (e.g., applyAnalysisConversion) which can use these patterns just fine, they can be reverted... That being said, if you use eraseArgument, you can use replaceAllUsesWith and the rest of in-place modifications because none of those would be compatible with the rollback... @rriddle will be the most knowledgeable here |
mlir/lib/Dialect/SCF/SCF.cpp | ||
---|---|---|
755 | I would not rely on the fact that they aren't currently reverted. All patterns generally have the same restrictions irregardless of the driver, i.e. you should not really be mutating outside of the pattern rewriter. If you code patterns, especially canonicalization patterns, to the quirks of a specific driver it breaks the ability to use it anywhere else and it also makes it much much more difficult if anyone ever wants to change the way the driver works in the future. |
mlir/lib/Dialect/SCF/SCF.cpp | ||
---|---|---|
755 | That's interesting: limiting canonicalization to "cancellable" transformations may incur extra costs (as we see here): do we have an RAUW exposed in a way that the driver could change the behavior to have efficient canonicalizations? (there may be tricky interactions though, I'm not sure how it'd play out) |
mlir/lib/Dialect/SCF/SCF.cpp | ||
---|---|---|
755 | It's not just about cancellable or not, even the greedy driver keeps a mapping of things which can easily become invalidated. There have been many instances of crashes/asan failures because a pattern is doing something without informing the driver. For dialect conversion it is largely about cancellability, for greedy it is largely about allowing the driver to remove mappings for things that are being invalidated. |
- Added a comment that reflects the discussions about clone (can be reverted) vs inline (performance).
- The current implementation is revertible at the cost of cloning the loop body.
- Use std::tie to improve the code readability.
I think the conclusion is that we want it to be cancelable. So this is ready to land.
mlir/test/Dialect/SCF/canonicalize.mlir | ||
---|---|---|
12 | I agree it is not very relevant but yes, you could replace the results with the init values in such case. |
I don't think most of these are necessary.