This transformation is useful to break dependency between consecutive loop iterations by increasing the size of a temporary buffer.
This is usually combined with heavy software pipelining.
Started a discourse thread in case we need this needs further discussions: https://discourse.llvm.org/t/loop-double-buffering-multi-buffering/59979
Details
Diff Detail
Event Timeline
mlir/lib/Dialect/MemRef/Transforms/MultiBuffering.cpp | ||
---|---|---|
68–70 ↗ | (On Diff #407388) | This should in general have nothing to do with an scf::for op -- you aren't transforming the loop in any way. Just use LoopLikeOpInterface. |
99–104 ↗ | (On Diff #407388) | Avoid generating lower-level arithmetic ops. Instead, just generate a modulo using affine.apply. |
mlir/lib/Dialect/MemRef/Transforms/MultiBuffering.cpp | ||
---|---|---|
68–70 ↗ | (On Diff #407388) | I need to access the single induction variable as well as the lower bound and step. This is not possible with LoopLikeInterface as it cannot be applied to any loops. |
99–104 ↗ | (On Diff #407388) | Good point, I didn't realize this was a valid affine map. Changed it. |
mlir/include/mlir/Dialect/MemRef/Transforms/Passes.h | ||
---|---|---|
77 | Please document the return value. | |
77 | This is a utility and should be returning LogicalResult instead of bool I think. | |
mlir/lib/Dialect/MemRef/Transforms/MultiBuffering.cpp | ||
65 ↗ | (On Diff #407796) | Please document the return value. |
66 ↗ | (On Diff #407796) | You don't need a triple doc comment here. Just the more detailed version on the declaration will suffice. |
68–70 ↗ | (On Diff #407388) | This should be pretty straightforward: you can just have the op interface method returning the IV (say getInductionVar()) return None for multi-dim loops (like for scf.parallel) and have this method fail in such a case just like you already do at line 87. It's going to be as simple and strictly more general. For the other two things you mention (lower bound and step), that's straightforward as well but the interface method will have to materialize some trivial IR (for eg. when it's a constant as in the case of affine.for): // For IV.
Optional<Value> getOrCreateLowerBound(OpBuilder &b);
Optional<Value> getOrCreateStep(OpBuilder &b); With the above, it would be strictly more general than restricting it to scf::ForOp and as simple. And in your case of interest here, you'll get output IR identical to what you have in the test cases below. |
mlir/test/Dialect/MemRef/multibuffering.mlir | ||
67–68 ↗ | (On Diff #407796) | You don't need the %{{.*}} = part. |
mlir/include/mlir/Dialect/MemRef/Transforms/Passes.h | ||
---|---|---|
77 | The utilities are almost always named imperatively instead of a gerund. multiBuffering -> multiBuffer? |
I switched to using loop interface and added a test for affine.for op. Please take another look.
mlir/lib/Dialect/MemRef/Transforms/MultiBuffering.cpp | ||
---|---|---|
68–70 ↗ | (On Diff #407388) | I added the interface and moved the code to use it. This means it now works on affine.for op. I added a test for it. |
The comments that I had have been addressed. Dismissing change request.
mlir/lib/Dialect/MemRef/Transforms/MultiBuffer.cpp | ||
---|---|---|
23 | Define using dyn_cast to avoid a double check in assert mode. |
mlir/lib/Dialect/MemRef/Transforms/MultiBuffer.cpp | ||
---|---|---|
29 | You are also collecting and erasing ops here. | |
36–53 | Invert the conditional? if (!....) { push_back continue; } ... | |
70 | Unchecked use of dyn_cast result. This should just be a single cast? | |
130–131 | strides: append -> assign (since this is empty at this stage). |
mlir/lib/Dialect/MemRef/Transforms/MultiBuffer.cpp | ||
---|---|---|
70 | OpFoldResult/PointerUnion don't have cast they only have dyn_cast, I see it used like that in the rest of the code |
LGTM!
mlir/lib/Dialect/MemRef/Transforms/MultiBuffer.cpp | ||
---|---|---|
89 | Nit: if this user doesnt dominate... |
Please document the return value.