The current BufferPlacement implementation does not support nested region control flow. This CL adds support for nested regions via the RegionBranchOpInterface and the detection of branch-like (ReturnLike) terminators inside nested regions.
Details
Diff Detail
- Repository
- rG LLVM Github Monorepo
Event Timeline
Can you also add a test with deeper nesting?
mlir/lib/Transforms/BufferPlacement.cpp | ||
---|---|---|
72 | Another way to describe exitParentRegion would be to only consider immediately nested regions. You can use the approach you have now, or simply use a loop over regions/blocks and get their terminators in the true case. Less traversal work. | |
78 | would terminator->getParentRegion()->getParentOp() == operation do the same? | |
148 | detailed information is not very helpful in a comment. What does it query? | |
162 | Maybe llvm::for_each? Drop the {,}. | |
176 | So these are the entry regions for this operation. Maybe write that in the comment. | |
180 | This ties the operands of the op with a region to the block arguments of the target region, | |
181 | Block &successorBlock = regionSuccessor.getSuccessor()->front(); ? | |
182 | for_each or drop the brackets. | |
186 | Now that you have queried the flow-in, you can also query the flow within the op. For this, you can call regionInterface.getSuccessorRegions once for each region of the op. The alias registration is the same as with the initial flow into the op. Except, when regionSuccessor.getSuccessor() is nulltptr. That signals that the terminator of the region will exit the region. So you have to tie the regionSuccessor.getSuccessorInputs() to parentOp->getResults() in that case. | |
190 | /*exitParentRegion=*/false | |
190 | I don't think this is needed with the above implemented. | |
294 | front()? | |
412 | This copies a region result, so maybe reflect that in the name. | |
420 | This also needs to use the RegionBranchOpInterface to find all the terminators in the operation and their successor inputs (they might be in a different order than the op results). Querying the terminators directly makes assumptions about ordering that do not exist. | |
mlir/test/Transforms/buffer-placement.mlir | ||
779 | There should be a dealloc here, right? |
Added extended support for the RegionOpInterface to query successor bindings for successor regions
mlir/lib/Transforms/BufferPlacement.cpp | ||
---|---|---|
72 | Please move static functions to the global namespace. anonymous namespace should only really be used by things like classes. | |
151 | nit: Use a lambda instead. | |
166 | nit: I don't think this really saves anything over a normal for loop, if anything it is much less efficient. | |
175 | nit: Use the full name for the type. | |
200 | nit: Use parameter names when passing constants, i.e., /*someName=*/ | |
403 | You are completely disregarding the fact that getMutableSuccessorOperands can return None, which would cause this to crash. Where are you checking that this is valid? | |
435 | Is this guaranteed to be a RegionBranchOpInterface? | |
453 | llvm::find? |
mlir/lib/Transforms/BufferPlacement.cpp | ||
---|---|---|
175 | Why is this called operands? These are the results of the overall operation, right? | |
177 | This region does not have a valid successor block if it terminates the parent operation. In that case, we wire its successor operands with the parent operations results. This could be more obvious in the code if you would pass the parent operation. Please at least rename operands to results. | |
182 | The loop and the zip could be part of the helper function registerAliasFunc. | |
187 | Isn't this the wrong way round? The results alias the successor inputs? | |
213 | This only follows one level of branching. So if the entry region goes to region 2, and that one goes to region 3, the second link will not be seen. It should be good enough to loop over all regions of an operation and then do this linking to successor regions. | |
403 | The underlying assumption is that if the successor has block arguments, then the branch in the predecessor needs to have operands for those. Can we rely on this? |
mlir/lib/Transforms/BufferPlacement.cpp | ||
---|---|---|
403 | Branching operations are not required to have a Value already materialized for a block argument. There are certain classes of operations that internally generate the value that is passed to the branch. For example, operations like LLVM Callbr and certain SIL switches. |
mlir/lib/Transforms/BufferPlacement.cpp | ||
---|---|---|
403 | We have changed the code to emit an error message. We prefer the error message over silently ignoring this case as the retrieved alias information can be invalid if an operation passes values implicitly to a block argument. | |
435 | The value should either be a BlockArgument or a value resulting from a RegionBranchOpInterface operation. |
mlir/lib/Transforms/BufferPlacement.cpp | ||
---|---|---|
162 | nit: interface | |
189 | entryRegion.getSuccessorInputs returns the inputs of the target region, typically the block arguments of the first block. So this maps them with itself. | |
205 | The terminator should also implement the BranchOpInterface, right? So, one should query the inputs to the successor using BranchOpInterface.getSuccessorOperands, correct? Here, successorRegion.getSuccessorInputs returns the input values to the region, which normally would be the block arguments or, in case this leaves the operation, the results. |
Added support for region-region control flow within operations that implement the RegionBranchOpInterface.
Added new test operations to verify more advanced region-region control-flow scenarios.
Cool stuff. Great to see this work with complex region interfaces, as well!
So, this now assumes that there is a 1:1 mapping between return-like op operands and successor inputs and we likely want an interface that makes this configurable. Maybe the returnlike trait could provide a function that enables this. (We discussed this, just writing it down.)
Also, we need a clean-up pass to remove unneeded alloc+copy+dealloc triples. This is underway.
With the nits addressed and some of the comments cleaned up, this is good to go.
mlir/lib/Transforms/BufferPlacement.cpp | ||
---|---|---|
190 | Mega-nit: entryRegion->getSuccessor reads weird. It read like getting the successor of the entry region but actually it gets the entry region itself :) Maybe rename this entrySuccessor? | |
194 | Maybe Wire flow between regions and from region exits.? | |
200 | The length of operandAttributes might be wrong here, as it was built for the entry successors. | |
202 | I think this comment is now off. It always wires the terminator operands with the successor inputs. The latter can be block arguments of a region's entry block or the result values, if the terminator exists the op. | |
293 | Should this apply to the case above, as well? | |
394 | Where is this conversion? Or is this comment off? | |
399 | Nit: Doesn't this create a new alloc and a copy? If so, please fix the comment. | |
446 | So is it OK to pass an empty ArrayRef here? If so, why can we not do this in the other cases? | |
mlir/test/lib/Dialect/Test/TestOps.td | ||
1344 ↗ | (On Diff #273290) | Please use let description = [{ ... }]; instead of comment. |
Addressed reviewer comments.
mlir/lib/Transforms/BufferPlacement.cpp | ||
---|---|---|
189 | Yeah makes sense to me; we should wire getSuccessorEntryOperands with getSuccessorInputs. | |
200 | According to our interpretation of the comment
this should be an array containing values for each operand of the defining operation; although it might make sense to include an attribute for each block argument. However, we should keep this for now, as it is fully compatible with the current implementation of all instantiations of the RegionBrachOpInterface. | |
205 | I guess querying a BranchOpInterface might not make sense in the case of a ReturnLike op, since it does not branch to any block in general (although this can still be expressed via the BranchOpInterface). It feels like that a ReturnLike terminator should provide a method to access all operands similar to the BranchOpInterface. |
Another way to describe exitParentRegion would be to only consider immediately nested regions. You can use the approach you have now, or simply use a loop over regions/blocks and get their terminators in the true case. Less traversal work.