Details
- Reviewers
ftynse - Commits
- rG4184018253e7: [mlir][SCF] Canonicalize nested ParallelOp's
Diff Detail
- Repository
- rG LLVM Github Monorepo
Event Timeline
mlir/lib/Dialect/SCF/SCF.cpp | ||
---|---|---|
1736 | Could you rather use rewriter.mergeBlocks here? It lets you remap values and doesn't require copying the operation. Not necessarily in the bodyBuilder, just create a new ParallelOp and merge the body of the inner loop into that of newly created loop, then erase the duplicate terminator. |
mlir/lib/Dialect/SCF/SCF.cpp | ||
---|---|---|
1736 | I didn't quite get how to do this. If we create create new outer ParallelOp and try to merge block from old inner ParallelOp into it, this new block will still reference iterArgs from old outer ParallelOp (we cannot remap them via mergeBlocks api). |
mlir/lib/Dialect/SCF/SCF.cpp | ||
---|---|---|
1736 | We can remap, that's what the third argument is for in https://github.com/llvm/llvm-project/blob/182162b61629f039e7aafc3f7eaab9cc64a81c83/mlir/include/mlir/IR/PatternMatch.h#L734. |
mlir/lib/Dialect/SCF/SCF.cpp | ||
---|---|---|
1736 | My bad, you mean the outer loop. |
mlir/lib/Dialect/SCF/SCF.cpp | ||
---|---|---|
1736 | It can remap only source block arguments as I understand but iter vars from outer loop are no inner block arguments, they are just captured in inner block. |
Anyway, there is another issue, I should also add check that outer block iter vars not being used as inner loop bounds/steps
What was the motivation behind this canonicalization? In particular, why would a combined parallel loop be considered the canonical form?
We found this pattern because it undoes the tiling we do in some cases. I agree the two forms are semantically equivalent, so the transformation is correct. It is inconvenient for our use case, though.
Isn't the whole point of canonicalization to convert semantically equivalent codes to the single form? So we should either always merge nested loops or always split them (which makes nested loops support in scf.parallel useless).
I agree with this. As a single scf.parallel op can represent multi-dimensional nests, the canonical form would be one where successive (1-d or n-d) scf.parallel ops are combined into a larger n-d scf.parallel op. However, a few issues with this revision:
- It's missing a commit summary; please don't leave it empty in such cases.
- Test cases are missing nests where one or more of the input scf.parallel were already multi-dimensional. This is important to exercise.
- The rewrite pattern is missing a doc comment and code comments at places.
mlir/lib/Dialect/SCF/SCF.cpp | ||
---|---|---|
1722 | Can you just use llvm::is_contained? (In that case, this lambda isn't adding anything.) |
Can you just use llvm::is_contained? (In that case, this lambda isn't adding anything.)