The affine.execute_region op executes its region exactly once while
defining a new polyhedral scope for its region for analysis and
transformation purposes, i.e., a new symbol context is defined for
operations appearing in its region. This allows the polyhedral form to
be used in a wider context without the need for function outlining.
The op explicitly captures only memrefs and is lowered readily to an
std.execute_region.
Details
Diff Detail
- Repository
- rG LLVM Github Monorepo
Event Timeline
mlir/docs/Dialects/Affine.md | ||
---|---|---|
612 ↗ | (On Diff #236236) | I don’t think you addressed my concerns on this topic? |
mlir/docs/Dialects/Affine.md | ||
---|---|---|
612 ↗ | (On Diff #236236) | I think I responded to everything and included all of the arguments in the RFC: |
mlir/docs/Dialects/Affine.md | ||
---|---|---|
612 ↗ | (On Diff #236236) | I explained my concerns in the original thread https://groups.google.com/a/tensorflow.org/d/msg/mlir/O5PXVbtlSng/3SXmxDiLAAAJ Here is what I wrote:
I don't think you answered these points that explain why I am not convinced it is OK to have explicit capture just for memref. You answered the email above in the thread, but you didn't address it I believe, you wrote "I'll respond to the connection to the affine.graybox proposal in another post" but I didn't see another post after that. (I'm on vacation till 1/10, expect some delays in answers) |
mlir/docs/Dialects/Affine.md | ||
---|---|---|
612 ↗ | (On Diff #236236) | The downsides of not explicitly capturing memrefs on the graybox are discussed in the RFC here. Another way to think about this is that: because grayboxes introduce a new symbol context, most polyhedral walks would like to conceptually see the graybox just as a "call" with those memrefs passed to start with. Other arbitrary/unknown ops with regions don't start a symbol context and so the affine walks will just walk through such ops (just like they walk through affine.fors/ifs that are encountered). OTOH, walkAffine will not walk through grayboxes from the top. If you don't explicitly capture, the key is that most polyhedral/affine passes will have to stop/check every op for a graybox and then scan the interior of that op for memrefs if it turns out to be a graybox. (For the future, this would even go against multithreading polyhedral passes to run concurrently on different func's and grayboxes in an isolated way, but I'm not bringing this up now in the RFC). With an explicit capture, things would just be a regular operand scan of ops scanned with walkAffine. For the future cases where you really need more precise information on what's happening to the memrefs inside the graybox (just like you may want to in the case of call's), that can be done as needed. Non-memref values on the other hand just move transparently across the boundaries of grayboxes in the regular SSA passes/canonicalizations or hybrid polyhedral/SSA ones. My concerns with explicit capture are actually very different from yours: that they make it harder to move IR across without actually updating the memrefs being used (either hoist from inside to outside or sink). You'd have to check if you are moving past a graybox and then remap memrefs (consider scalar replacement on affine load/stores, LICM as examples). But I still strongly feel that explicit capture of the memrefs is the right tradeoff to start with (even if perhaps not the right one eventually) - we can reevaluate its impact and drop if necessary. Best, |
mlir/docs/Dialects/Affine.md | ||
---|---|---|
612 ↗ | (On Diff #236236) | And finally reg. your points on unknown ops with regions that explicitly capture in addition to perhaps implicitly: that's an issue separate from affine.graybox and that exists in the codebase as is now. If you have an op whose operands bind to its region's arguments in ways that are unknown, the current affine passes/utilities don't check for example if the memrefs inside are shadows, how memory footprints / dependences should be accurate computed, etc. (in fact, they are already incorrect because they'd walk through that op and treat those memrefs as distinct). The solution to this would depend on the pass/utility: some should just do nothing in the presence of unknown ops that have one or more regions (we shouldn't even say 'explicit capture' here because we don't even know if it's really capturing in spite of having operands). Since that same op could also implicitly capture (as you mention in one case), choosing to not look inside may not be an option depending on the pass (unless of course the op is known to be isolated from above). But how is all this related to explicit capture in affine.grayboxes? The latter is a known op where the binding between its operands and region's arguments is well-defined. |
Rename affine.graybox -> affine.execute_region. Rebase + updates to
context-sensitive valid dim/symbol checking.
I am very sorry, for some reason this diff was not appearing on my phabricator todo list until recently.... Please do not hesitate to ping me by email if I take more than a couple of workdays to iterate.
In general, this change makes sense to me and looks like a proper extension of the affine modeling. My only design concern is that one can circumvent the explicit-capture mechanism for memrefs. A direct solution would be to disallow any values of memref type to be defined in the region, but I have not considered the implications of this on the expressiveness.
There are several places where some action is performed twice (e.g., parsing attribute dict), requesting change for these.
mlir/include/mlir/Dialect/Affine/IR/AffineOps.td | ||
---|---|---|
629–630 | I understand the idea of the restriction, but it looks like it can be circumvented by: %0 = ... : memref<...> %1 = cheat.wrap %0 : memref<...> -> !cheat.opaque affine.execute_region { // This defines the memref inside the region, so seemingly complies with the semantics. %2 = cheat.unwrap %1 : !cheat.opaque -> memref<...> affine.load %2[...] } | |
631–634 | I would avoid referring FuncOp, it is not special in any sense, and it actually allows any terminator. Neither would I remind that blocks must have terminators, it's a core IR requirement. "returns to right after the affine.execute_region op" sounds unnecessarily complex to me. "std.return" terminator placed inside blocks of the "affine.execute_region" returns the control flow to the "affine.execute_region". Since we already mentioned that "execution_region" executes the region exactly ones, it is naturally implied that the control flow will be further transferred to the control-flow successor of "execution_region... | |
655 | Syntax nit: I'd consider omitting empty [] = (), looks distracting | |
mlir/lib/Dialect/Affine/IR/AffineOps.cpp | ||
108 | Nit: wouldn't it be easier to accept the region as argument instead of the operation that contains it? | |
121 | Would variadic template help you? isKnownIsolatedFromAbove isn't a class... | |
128 | The assert message is misleading. The op may have parent ops, just none that satisfy the "affine scope" conditions. Also, use llvm_unreachable instead of assert(false) and drop the return value | |
163–164 | I cannot relate this comment to the code below. | |
166 | Shouldn't this also check for "KnownIsolatedFromAbove" ? | |
1054 | Nit: we tend to use auto when it increases readability, Operation * would look just fine here | |
2276 | Prefer ValueRange to ArrayRef<Value> | |
2285 | Nit: drop trivial braces | |
2288 | If you have ValueRange, this entire vector manipulation gets replaced by memrefs.getTypes() | |
2289 | You already pushed it in line 2295 | |
2325 | Avoid capturing llvm::enumerate by-reference. An enumerator only keeps iterators, so taking copying it is cheap, and we avoid running into potential problems with implicit lifetime extension | |
2327 | Since you already have an enumerator, you might as well mention the position of the mismatching argument. | |
2390 | You already parser the attr dict 4 lines above. |
Thanks for the detailed review! Yes, we need to still converge on the memref explicit capture. But I'll address these other straightforward changes first.
Address review comments.
mlir/include/mlir/Dialect/Affine/IR/AffineOps.td | ||
---|---|---|
629–630 | Independently of affine.execute_region, such a problem exists with memref non-dereferencing ops and it can / has to be tackled. For example, consider this variation of your snippet (no affine.execute_region). %0 = ... : memref<...> %1 = cheat.wrap %0 : memref<...> -> !cheat.opaque %2 = cheat.load %1[0, 0] : !cheat.opaque cheat.store %v, %1[0, 0] : !cheat.opaque call @foo(%1) : (!cheat.opaque) -> () affine.load %0[0, 0] : memref<...> Note that all the current passes/utilities including affine store to load fwd'ing, invariant load hoisting/scalar rep, dependence analysis itself, the affine loop fusion would do the wrong thing here because there is an escape side channel like you show. So, the thing you are pointing to is interesting and has to be tackled, but if we step back and take another look, this is the same as the larger issue of dealing with escaping or aliasing and not specific to execute_region. A solution to deal with these is to actually detect/treat unknown memref non-dereferencing ops that define SSA values (like your cheat.unwrap) and bail out in their presence (depending on what we are doing). For eg. the dependence information isn't going to be accurate in their presence. The point of explicit captures is that you know which memrefs are going in, but it isn't free of the problem of side-channel escapes that manifest in straightline code themselves. But whenever we don't have such escapes (and we can always detect that being conservative), which I believe is the common scenario, having the captures does serve its intended purpose -- you still won't have to look inside and do something special with these ops in all passes/utilities. | |
631–634 | Sure, dropped these lines. | |
mlir/lib/Dialect/Affine/IR/AffineOps.cpp | ||
108 | Actually, it is - thanks! | |
121 | Right - it won't; the comment is probably stale (it was FuncOp and AffineExecuteRegionOp earlier). Anyway this code will be updated to make use of a new op trait 'PolyhedralScope' in another revision, which I'll submit as this one's parent. AffineExecuteRegion will be marked with this trait and other ops that want to define new scopes can have that trait. | |
128 | Thanks. | |
166 | This was checked as part of isValidSymbol above. | |
1054 | Sure. | |
2288 | Sure, this was really old code - that didn't get updated post ValueRange/TypeRange migrations! | |
2289 | Thanks! |
Once again, this disappeared from my attention list... Is it due to "unresolved" grand-parent diff where I'm not listed as a reviewer?
mlir/include/mlir/Dialect/Affine/IR/AffineOps.td | ||
---|---|---|
629–630 | In the general case, I suppose it can be even worse than that. func @foo(%arg0: !cheat.opaque, %arg1: memref<..>) { // opaque and memref alias cheat.do_cheat %arg0 // this affects the memref affine.load %arg1[...] } which looks like we either need a powerful and abstract enough way to describe the aliasing between objects of different types, or to treat any side-effecting operation conservatively. Anyway, I agree with the argument that the proposed approach is no worse than what we already have in Affine and I would like to make progress on this. |
That shouldn't happen logically - if it helps, I can add you as a reviewer there as well!
Rebase on polyhedral scope trait (simplifies this revision) + address remaining concerns
mlir/include/mlir/Dialect/Affine/IR/AffineOps.td | ||
---|---|---|
629–630 |
That's right, thanks. | |
mlir/lib/Dialect/Affine/IR/AffineOps.cpp | ||
2305 | You actually do :-) (Notice the continue above is guarded again inside. ) - although I could better use an else here. | |
2309 | Thanks, but the block arg owner check above also has to be fixed similarly to allow it to be any descendent of 'op'. Done - used a lambda with an all_of. |
mlir/lib/Dialect/Affine/IR/AffineOps.cpp | ||
---|---|---|
2280 | Unfortunately, using this messes up the insertion point subsequently - outweighs the idiomacy it provides and I'd like to avoid using a scope guard. |
@mehdi_amini Could you take a look and let me know if you have any concerns with bringing this in? One thing this unblocks is an inliner into an affine.for/if.
What does this op bring on top of scf.execute_region? Can scf.execute_region carry the AffineScope trait?
Somehow related, in another thread of discussion, we discussed reversing the AffineScope traits: replacing it by a trait that would mean the opposite (let's say NotAnAffineScope as a straw man) and consider every operation that don't define NotAnAffineScope. This would enable to use any op unknown to affine that define a region inside an affine scope, it would just create implicitly a new scope. We could avoid the need to tag operations like FuncOp with AffineScope since that would be the default. Instead only affine operation would need tagging.
mlir/include/mlir/Dialect/Affine/IR/AffineOps.td | ||
---|---|---|
111 | (formatting still to be fixed here) |
Re-read the RFC on "graybox", I am still entirely unconvinced by the current design and the explicit capture of memref used here. I think this is less flexible and not justified all-in-all. believe the more extensible alternative is to support opaque nested regions in general and treat them conservatively as new scope (basically what I wrote above about reversing the trait).
But I'm not willing to die on this hill today, if @ftynse agrees with you that affine should stay a bit more isolated, and that this explicit capture is really worth it right now, then feel free to move forward.
scf.execute_region can carry the affinescope trait (or equivalently can be free of the "ExpandAffineScope" trait which is the complement of AffineScope and which is what we want to replace AffineScope with). The real difference between the two is just going to be memref captures.
Somehow related, in another thread of discussion, we discussed reversing the AffineScope traits: replacing it by a trait that would mean the opposite (let's say NotAnAffineScope as a straw man) and consider every operation that don't define NotAnAffineScope. This would enable to use any op unknown to affine that define a region inside an affine scope, it would just create implicitly a new scope. We could avoid the need to tag operations like FuncOp with AffineScope since that would be the default. Instead only affine operation would need tagging.
Yes, we already discussed all of this. Just for the name, "ExpandsAffineScope" is more meaningful since it's adding to affine scope started by an op above that didn't have that trait. Any region holding op that doesn't have ExpandsAffineScope starts such a scope. Only affine.for, affine.if, and affine.parallel will have this trait. It's the complement of the current AffineScope.
Actually, if you don't want explicit memref captures, you can just use scf.execute_region as is! There would be no difference besides the fact that the latter would be from the scf dialect mixed with affine dialect ops and being looked at by affine passes/utilities etc. for transformation purposes, which is fine. I have felt that having a list of memrefs on the operand list of an affine.execute_region simplifies everything (passes/utilities) doing a walk from the top - they deal with that op transparently just like any other non-region holding op that has memref operands. But I'm willing to hold on to this revision until the use case beyond what can already be served by scf.execute_region arrives. We need to perhaps first think about extending affine passes properly in the presence of scf.execute_region ops and reevaluate this.