loop.coalesce_parallel enable the coalescing of arbitrary index variables in an scf.forall. This is done by calculating the old index variables based on the new index variables.
Details
Diff Detail
- Repository
- rG LLVM Github Monorepo
Event Timeline
mlir/include/mlir/Dialect/SCF/TransformOps/SCFTransformOps.td | ||
---|---|---|
221 | Can't we rather extend loop.coalesce to dispatch to one of the two functions based on the kind of op it sees? | |
226 | Please provide more documentation tha one line. | |
234–236 | What are these? Please document. It's unclear to me why do we need to hardcode exactly three of these. | |
mlir/test/Dialect/SCF/transform-op-coalesce.mlir | ||
99 | Coalescing normally requires two nested loops. I suppose this attempts to collapse iterators of a single op, but this is not documented, and not really exercised in the test, which doesn't test for the trip count of the new dimensions or for how the access indices are updated. |
Updated the op documentation and addressed the comments.
mlir/include/mlir/Dialect/SCF/TransformOps/SCFTransformOps.td | ||
---|---|---|
221 | Since we need additional collapsed dimensions input specifying the position of the newly coalesced dim. Also, I added another input for the mapping attribute which is not necessary in the case of scf.for loop. | |
226 | Added | |
234–236 | collapsed_dim0 means that whatever is specified in that array will be dim0 in new scf.forall. I added an example in op document. | |
mlir/test/Dialect/SCF/transform-op-coalesce.mlir | ||
99 | Could you please help me understand your comment better? I apologize if I didn't fully comprehend what you were trying to convey. |
Apologies for the delay. Some more comments, but we are getting there.
mlir/include/mlir/Dialect/SCF/TransformOps/SCFTransformOps.td | ||
---|---|---|
221 | Okay, then let's at least rename this to coalesce_forall because "scf.parallel" is a different construct that is not being handled here. | |
226 | Nit: let's make sure it fits into 80 cols. | |
227 | Nit: numbner/number | |
232 | This can use markdown-style three backticks to delimit the code here (it will also render properly on the website). | |
234–236 |
We can have an ArrayAttr of DenseArrayAttr (or any other attribute as a matter of fact). Let's do this instead. Hardcoding assumptions about hardware isn't desirable at this level. | |
mlir/lib/Dialect/SCF/TransformOps/SCFTransformOps.cpp | ||
257 | Ultra-nit: no need to prefix SmallVector` with llvm::, it is re-exported in the mlir namespace. Also, you may want to consider RaggedArray from mlir/Dialect/Transform/Utils/. | |
259 | Nit: please expand auto unless the type is obvious from context (e.g., there's a cast on the RHS). | |
279 | It would be nice to add a diagnostic note pointing to the specific loop that failed to coalesce. | |
mlir/lib/Dialect/SCF/Utils/Utils.cpp | ||
713–714 | Let's rather make this function accept a RewriterBase so we can connect that properly from elsewhere in the codebase. | |
797–799 | If this uses RewriterBase, all of these becomes much simpler and cheaper. The body block of the original loop can be inlined, which also handles the update of block arguments. Using RAUW makes this function incompatible with rewrites and generally difficult to debug. | |
mlir/test/Dialect/SCF/transform-op-coalesce.mlir | ||
99 | Now that there's more documentation, I suppose the only missing thing here is the check that the new loop iterates in (1824, 3648). | |
126 | Nit: please add the newline. |
Can't we rather extend loop.coalesce to dispatch to one of the two functions based on the kind of op it sees?