This is an archive of the discontinued LLVM Phabricator instance.

[MLIR] Introduce affine.execute_region op
Changes PlannedPublic

Authored by bondhugula on Jan 4 2020, 9:41 PM.

Details

Summary

The affine.execute_region op executes its region exactly once while
defining a new polyhedral scope for its region for analysis and
transformation purposes, i.e., a new symbol context is defined for
operations appearing in its region. This allows the polyhedral form to
be used in a wider context without the need for function outlining.
The op explicitly captures only memrefs and is lowered readily to an
std.execute_region.

Diff Detail

Event Timeline

bondhugula created this revision.Jan 4 2020, 9:41 PM
bondhugula updated this revision to Diff 236224.Jan 4 2020, 9:42 PM
bondhugula removed a reviewer: nicolasvasilache.
  • rename getParentAffineScope -> getAffineScope

Update doc comments.

  • invalid test cases
  • update affine dialect doc
bondhugula updated this revision to Diff 236231.Jan 5 2020, 3:01 AM
  • fix comments
bondhugula updated this revision to Diff 236236.Jan 5 2020, 4:59 AM
  • add missed result parsing
  • fix verifier
mehdi_amini requested changes to this revision.Jan 5 2020, 5:49 AM
mehdi_amini added inline comments.
mlir/docs/Dialects/Affine.md
612 ↗(On Diff #236236)

I don’t think you addressed my concerns on this topic?

This revision now requires changes to proceed.Jan 5 2020, 5:49 AM
bondhugula marked an inline comment as done.Jan 5 2020, 7:06 AM
bondhugula added inline comments.
mlir/docs/Dialects/Affine.md
612 ↗(On Diff #236236)

I think I responded to everything and included all of the arguments in the RFC:
https://github.com/bondhugula/llvm-project/blob/graybox/mlir/rfc/rfc-graybox.md
Could you just provide a summary list of the concerns you still have - either here or on that thread as you prefer?

mehdi_amini added inline comments.Jan 5 2020, 2:35 PM
mlir/docs/Dialects/Affine.md
612 ↗(On Diff #236236)

I explained my concerns in the original thread https://groups.google.com/a/tensorflow.org/d/msg/mlir/O5PXVbtlSng/3SXmxDiLAAAJ

Here is what I wrote:

I am trying to not consider affine at all here. I wrote these example to try to illustrate how MLIR region/op interaction are structured opaquely to be able to derive cross-dialect invariants in general.
The invariant I am presenting above is independent from any dialect, let me abstract the type further:

func @foo(%value : !dialect.type) {
op.with_region {
any.op(%value) : (!dialect.type) -> ()
}
}

If I look at it generically, here is my take on it:
a) The op.with_region defines the semantic of its immediate region, so it can either accept or reject any.op.
b) Let's assume that op.with_region does not know anything about any.op (no traits, no prior knowledge, the any.op could be unregistered at this point).
c) For this IR to be valid, op.with_region must be accepting unknown op (like any.op).
d) From the perspective of op.with_region, the any.op is like an opaque call to some code it cannot see.

But what if any.op has a region?

func @foo(%value : !dialect.type) {
op.with_region {
any.op(%value) ({
^bb(%value_inside):
// do something with %value_inside (explicitly captured)
}) : (!dialect.type) -> ()
}}

Here what changes is that:
e) any.op has a region now. Unless op.with_region forbids unknown operation from having a region in its verifier, and it does not have specific handling for any.op, then the IR should be valid.
f) Since %value_inside is explicitly captured, without knowing specifically any.op, then the uses of %value_inside cannot be restricted by op.with_region but only by any_op.
g) For any practical purpose here, there should not be any difference between this form and the first one above.

Finally, what if any.op has implicit capture?

func @foo(%value : !dialect.type) {
op.with_region {
any.op() ({
// do something with %value (implicit capture instead of explicit)
}) : (!dialect.type) -> ()
}}

Now:
h) any.op() is implicitly capturing %value.
g) Without more information about any.op (traits, etc.), this should be equivalent to the explicit capture case: if the IR was valid the first and second case, then it should be valid here.

If we don't have these properties, and if op.with_region can constrain the validity of the region attached to any.op, then any.op is not longer in control of the semantics of the enclosed region. No transformation can operate on any.op without knowing all of the enclosing operations, since these can add arbitrary restrictions.
For example, this is a valid IR (you can pipe this in mlir-opt right now):

module {
"d1.op1" () ({
"d2.op2" () ({
module {
func @bar() {
return
}
func @foo() {
call @bar() : () -> ()
return
}
}
"d2.done" () : () -> ()
}) : () -> ()
}) : () -> ()
}

If I get the inner @foo function, and would like to inline the call to @bar, what do I need to check to ensure I can?
If the FuncOp defines the semantic of the region, then the FuncOp should control itself whether it allows to inline or not, and I should query FuncOp for @foo, CallOp for the call-site, FuncOp for @bar, and likely the op inside @bar that I am about to inline.

If you allow to put restriction on what can happen inside @foo(), based on the enclosing operation, then you can't inline unless you ensure that all the enclosing operation will be happy with it (so you need to check the enclosing modules, but also "d1.op1" and "d2.op2").

Basically, this would be breaking the composability of the IR: you couldn't assemble independent pieces and reason about them independently. I don't know why we would want that, here we really want to reason about the functions in the inner module independently if they are surrounded by "d1.op1"() and "d2.op2"() like here (otherwise none of the current passes in MLIR are correct).

I don't think you answered these points that explain why I am not convinced it is OK to have explicit capture just for memref.
I don't think it is necessary to have explicit capture of memref either by the way, dropping this may help getting forward right now.

You answered the email above in the thread, but you didn't address it I believe, you wrote "I'll respond to the connection to the affine.graybox proposal in another post" but I didn't see another post after that.

(I'm on vacation till 1/10, expect some delays in answers)

bondhugula marked an inline comment as done.Jan 5 2020, 8:31 PM
bondhugula added inline comments.
mlir/docs/Dialects/Affine.md
612 ↗(On Diff #236236)

The downsides of not explicitly capturing memrefs on the graybox are discussed in the RFC here.
https://github.com/bondhugula/llvm-project/blob/graybox/mlir/rfc/rfc-graybox.md#rationale-and-design-alternatives-what-to-explicitly-capture

Another way to think about this is that: because grayboxes introduce a new symbol context, most polyhedral walks would like to conceptually see the graybox just as a "call" with those memrefs passed to start with. Other arbitrary/unknown ops with regions don't start a symbol context and so the affine walks will just walk through such ops (just like they walk through affine.fors/ifs that are encountered). OTOH, walkAffine will not walk through grayboxes from the top. If you don't explicitly capture, the key is that most polyhedral/affine passes will have to stop/check every op for a graybox and then scan the interior of that op for memrefs if it turns out to be a graybox. (For the future, this would even go against multithreading polyhedral passes to run concurrently on different func's and grayboxes in an isolated way, but I'm not bringing this up now in the RFC). With an explicit capture, things would just be a regular operand scan of ops scanned with walkAffine. For the future cases where you really need more precise information on what's happening to the memrefs inside the graybox (just like you may want to in the case of call's), that can be done as needed. Non-memref values on the other hand just move transparently across the boundaries of grayboxes in the regular SSA passes/canonicalizations or hybrid polyhedral/SSA ones.

My concerns with explicit capture are actually very different from yours: that they make it harder to move IR across without actually updating the memrefs being used (either hoist from inside to outside or sink). You'd have to check if you are moving past a graybox and then remap memrefs (consider scalar replacement on affine load/stores, LICM as examples).

But I still strongly feel that explicit capture of the memrefs is the right tradeoff to start with (even if perhaps not the right one eventually) - we can reevaluate its impact and drop if necessary.

Best,
Uday

bondhugula marked an inline comment as done.Jan 6 2020, 12:59 AM
bondhugula added inline comments.
mlir/docs/Dialects/Affine.md
612 ↗(On Diff #236236)

And finally reg. your points on unknown ops with regions that explicitly capture in addition to perhaps implicitly: that's an issue separate from affine.graybox and that exists in the codebase as is now. If you have an op whose operands bind to its region's arguments in ways that are unknown, the current affine passes/utilities don't check for example if the memrefs inside are shadows, how memory footprints / dependences should be accurate computed, etc. (in fact, they are already incorrect because they'd walk through that op and treat those memrefs as distinct). The solution to this would depend on the pass/utility: some should just do nothing in the presence of unknown ops that have one or more regions (we shouldn't even say 'explicit capture' here because we don't even know if it's really capturing in spite of having operands). Since that same op could also implicitly capture (as you mention in one case), choosing to not look inside may not be an option depending on the pass (unless of course the op is known to be isolated from above). But how is all this related to explicit capture in affine.grayboxes? The latter is a known op where the binding between its operands and region's arguments is well-defined.

bondhugula retitled this revision from [LLVM] [MLIR] Introduce affine graybox op to [MLIR] Introduce affine graybox op.Mar 24 2020, 6:35 AM
bondhugula edited the summary of this revision. (Show Details)

Rebase.

changes after rebase - unregistered ops

Rename affine.graybox -> affine.execute_region. Rebase + updates to
context-sensitive valid dim/symbol checking.

bondhugula retitled this revision from [MLIR] Introduce affine graybox op to [MLIR] Introduce affine.execute_region op.Apr 18 2020, 1:30 PM
bondhugula edited the summary of this revision. (Show Details)
bondhugula removed a reviewer: rriddle.
bondhugula edited the summary of this revision. (Show Details)
ftynse requested changes to this revision.Apr 24 2020, 6:50 AM

I am very sorry, for some reason this diff was not appearing on my phabricator todo list until recently.... Please do not hesitate to ping me by email if I take more than a couple of workdays to iterate.

In general, this change makes sense to me and looks like a proper extension of the affine modeling. My only design concern is that one can circumvent the explicit-capture mechanism for memrefs. A direct solution would be to disallow any values of memref type to be defined in the region, but I have not considered the implications of this on the expressiveness.

There are several places where some action is performed twice (e.g., parsing attribute dict), requesting change for these.

mlir/include/mlir/Dialect/Affine/IR/AffineOps.td
1115–1116

I understand the idea of the restriction, but it looks like it can be circumvented by:

%0 = ... : memref<...>
%1 = cheat.wrap %0 : memref<...> -> !cheat.opaque
affine.execute_region {
  // This defines the memref inside the region, so seemingly complies with the semantics.
  %2 = cheat.unwrap %1 : !cheat.opaque -> memref<...>
  affine.load %2[...]
}
1117–1120

I would avoid referring FuncOp, it is not special in any sense, and it actually allows any terminator. Neither would I remind that blocks must have terminators, it's a core IR requirement.

"returns to right after the affine.execute_region op" sounds unnecessarily complex to me. "std.return" terminator placed inside blocks of the "affine.execute_region" returns the control flow to the "affine.execute_region". Since we already mentioned that "execution_region" executes the region exactly ones, it is naturally implied that the control flow will be further transferred to the control-flow successor of "execution_region...

1141

Syntax nit: I'd consider omitting empty [] = (), looks distracting

mlir/lib/Dialect/Affine/IR/AffineOps.cpp
234

Nit: wouldn't it be easier to accept the region as argument instead of the operation that contains it?

247

Would variadic template help you? isKnownIsolatedFromAbove isn't a class...

254

The assert message is misleading. The op may have parent ops, just none that satisfy the "affine scope" conditions.

Also, use llvm_unreachable instead of assert(false) and drop the return value

319–320

I cannot relate this comment to the code below.

322

Shouldn't this also check for "KnownIsolatedFromAbove" ?

1096

Nit: we tend to use auto when it increases readability, Operation * would look just fine here

2669

Prefer ValueRange to ArrayRef<Value>

2678

Nit: drop trivial braces

2681

If you have ValueRange, this entire vector manipulation gets replaced by memrefs.getTypes()

2682

You already pushed it in line 2295

2718

Avoid capturing llvm::enumerate by-reference. An enumerator only keeps iterators, so taking copying it is cheap, and we avoid running into potential problems with implicit lifetime extension

2720

Since you already have an enumerator, you might as well mention the position of the mismatching argument.

2783

You already parser the attr dict 4 lines above.

This revision now requires changes to proceed.Apr 24 2020, 6:50 AM

I am very sorry, for some reason this diff was not appearing on my phabricator todo list until recently.... Please do not hesitate to ping me by email if I take more than a couple of workdays to iterate.

In general, this change makes sense to me and looks like a proper extension of the affine modeling. My only design concern is that one can circumvent the explicit-capture mechanism for memrefs. A direct solution would be to disallow any values of memref type to be defined in the region, but I have not considered the implications of this on the expressiveness.

There are several places where some action is performed twice (e.g., parsing attribute dict), requesting change for these.

Thanks for the detailed review! Yes, we need to still converge on the memref explicit capture. But I'll address these other straightforward changes first.

bondhugula marked 26 inline comments as done.

Address review comments.

mlir/include/mlir/Dialect/Affine/IR/AffineOps.td
1115–1116

Independently of affine.execute_region, such a problem exists with memref non-dereferencing ops and it can / has to be tackled. For example, consider this variation of your snippet (no affine.execute_region).

%0 = ... : memref<...>
%1 = cheat.wrap %0 : memref<...> -> !cheat.opaque
%2 = cheat.load %1[0, 0] : !cheat.opaque
cheat.store %v, %1[0, 0] : !cheat.opaque
call @foo(%1) : (!cheat.opaque) -> ()
affine.load %0[0, 0] : memref<...>

Note that all the current passes/utilities including affine store to load fwd'ing, invariant load hoisting/scalar rep, dependence analysis itself, the affine loop fusion would do the wrong thing here because there is an escape side channel like you show. So, the thing you are pointing to is interesting and has to be tackled, but if we step back and take another look, this is the same as the larger issue of dealing with escaping or aliasing and not specific to execute_region. A solution to deal with these is to actually detect/treat unknown memref non-dereferencing ops that define SSA values (like your cheat.unwrap) and bail out in their presence (depending on what we are doing). For eg. the dependence information isn't going to be accurate in their presence. The point of explicit captures is that you know which memrefs are going in, but it isn't free of the problem of side-channel escapes that manifest in straightline code themselves. But whenever we don't have such escapes (and we can always detect that being conservative), which I believe is the common scenario, having the captures does serve its intended purpose -- you still won't have to look inside and do something special with these ops in all passes/utilities.

1117–1120

Sure, dropped these lines.

mlir/lib/Dialect/Affine/IR/AffineOps.cpp
234

Actually, it is - thanks!

247

Right - it won't; the comment is probably stale (it was FuncOp and AffineExecuteRegionOp earlier). Anyway this code will be updated to make use of a new op trait 'PolyhedralScope' in another revision, which I'll submit as this one's parent. AffineExecuteRegion will be marked with this trait and other ops that want to define new scopes can have that trait.

254

Thanks.

322

This was checked as part of isValidSymbol above.

1096

Sure.

2681

Sure, this was really old code - that didn't get updated post ValueRange/TypeRange migrations!

2682

Thanks!

bondhugula edited the summary of this revision. (Show Details)Apr 26 2020, 8:43 PM
ftynse accepted this revision.Apr 27 2020, 6:31 AM

Once again, this disappeared from my attention list... Is it due to "unresolved" grand-parent diff where I'm not listed as a reviewer?

mlir/include/mlir/Dialect/Affine/IR/AffineOps.td
1115–1116

In the general case, I suppose it can be even worse than that.

func @foo(%arg0: !cheat.opaque, %arg1: memref<..>) { // opaque and memref alias
  cheat.do_cheat %arg0 // this affects the memref
  affine.load %arg1[...]
}

which looks like we either need a powerful and abstract enough way to describe the aliasing between objects of different types, or to treat any side-effecting operation conservatively.

Anyway, I agree with the argument that the proposed approach is no worse than what we already have in Affine and I would like to make progress on this.

mehdi_amini added inline comments.Apr 27 2020, 1:12 PM
mlir/lib/Dialect/Affine/IR/AffineOps.cpp
2698

You don't need an if here.

2702

Seems like you can fix the FIXME and just write: if(op->isProperAncestor(memref.getDefiningOp()) continue;

Once again, this disappeared from my attention list... Is it due to "unresolved" grand-parent diff where I'm not listed as a reviewer?

That shouldn't happen logically - if it helps, I can add you as a reviewer there as well!

bondhugula marked 6 inline comments as done.

Rebase on polyhedral scope trait (simplifies this revision) + address remaining concerns

mlir/include/mlir/Dialect/Affine/IR/AffineOps.td
1115–1116

which looks like we either need a powerful and abstract enough way to describe the
aliasing between objects of different types, or to treat any side-effecting operation

That's right, thanks.

mlir/lib/Dialect/Affine/IR/AffineOps.cpp
2698

You actually do :-) (Notice the continue above is guarded again inside. ) - although I could better use an else here.

2702

Thanks, but the block arg owner check above also has to be fixed similarly to allow it to be any descendent of 'op'. Done - used a lambda with an all_of.

Herald added a project: Restricted Project. · View Herald TranscriptJun 18 2021, 7:29 PM
bondhugula edited the summary of this revision. (Show Details)Jun 18 2021, 7:29 PM

Rebase on upstream tip. Bring code to date.

Minor update to test cases.

bondhugula added inline comments.Jun 19 2021, 2:35 AM
mlir/include/mlir/Dialect/Affine/IR/AffineOps.td
1141

@ftynse Unfortunately, the parser API won't support this unless we want to disallow [] = () and even there we won't be able to emit the right error message. (Basically we won't know whether to parse the =.)

ftynse accepted this revision.Jun 21 2021, 4:16 AM
ftynse added inline comments.
mlir/include/mlir/Dialect/Affine/IR/AffineOps.td
120
mlir/lib/Dialect/Affine/IR/AffineOps.cpp
2673

We can now do builder.createBlock

Address review comments.

bondhugula marked an inline comment as done.

Avoid createBlock

bondhugula added inline comments.Jun 27 2021, 8:22 PM
mlir/lib/Dialect/Affine/IR/AffineOps.cpp
2673

Unfortunately, using this messes up the insertion point subsequently - outweighs the idiomacy it provides and I'd like to avoid using a scope guard.

@mehdi_amini Could you take a look and let me know if you have any concerns with bringing this in? One thing this unblocks is an inliner into an affine.for/if.

What does this op bring on top of scf.execute_region? Can scf.execute_region carry the AffineScope trait?

Somehow related, in another thread of discussion, we discussed reversing the AffineScope traits: replacing it by a trait that would mean the opposite (let's say NotAnAffineScope as a straw man) and consider every operation that don't define NotAnAffineScope. This would enable to use any op unknown to affine that define a region inside an affine scope, it would just create implicitly a new scope. We could avoid the need to tag operations like FuncOp with AffineScope since that would be the default. Instead only affine operation would need tagging.

mlir/include/mlir/Dialect/Affine/IR/AffineOps.td
120

(formatting still to be fixed here)

Re-read the RFC on "graybox", I am still entirely unconvinced by the current design and the explicit capture of memref used here. I think this is less flexible and not justified all-in-all. believe the more extensible alternative is to support opaque nested regions in general and treat them conservatively as new scope (basically what I wrote above about reversing the trait).
But I'm not willing to die on this hill today, if @ftynse agrees with you that affine should stay a bit more isolated, and that this explicit capture is really worth it right now, then feel free to move forward.

bondhugula added a comment.EditedJun 29 2021, 2:54 AM

What does this op bring on top of scf.execute_region? Can scf.execute_region carry the AffineScope trait?

scf.execute_region can carry the affinescope trait (or equivalently can be free of the "ExpandAffineScope" trait which is the complement of AffineScope and which is what we want to replace AffineScope with). The real difference between the two is just going to be memref captures.

Somehow related, in another thread of discussion, we discussed reversing the AffineScope traits: replacing it by a trait that would mean the opposite (let's say NotAnAffineScope as a straw man) and consider every operation that don't define NotAnAffineScope. This would enable to use any op unknown to affine that define a region inside an affine scope, it would just create implicitly a new scope. We could avoid the need to tag operations like FuncOp with AffineScope since that would be the default. Instead only affine operation would need tagging.

Yes, we already discussed all of this. Just for the name, "ExpandsAffineScope" is more meaningful since it's adding to affine scope started by an op above that didn't have that trait. Any region holding op that doesn't have ExpandsAffineScope starts such a scope. Only affine.for, affine.if, and affine.parallel will have this trait. It's the complement of the current AffineScope.

Re-read the RFC on "graybox", I am still entirely unconvinced by the current design and the explicit capture of memref used here. I think this is less flexible and not justified all-in-all. believe the more extensible alternative is to support opaque nested regions in general and treat them conservatively as new scope (basically what I wrote above about reversing the trait).
But I'm not willing to die on this hill today, if @ftynse agrees with you that affine should stay a bit more isolated, and that this explicit capture is really worth it right now, then feel free to move forward.

Actually, if you don't want explicit memref captures, you can just use scf.execute_region as is! There would be no difference besides the fact that the latter would be from the scf dialect mixed with affine dialect ops and being looked at by affine passes/utilities etc. for transformation purposes, which is fine. I have felt that having a list of memrefs on the operand list of an affine.execute_region simplifies everything (passes/utilities) doing a walk from the top - they deal with that op transparently just like any other non-region holding op that has memref operands. But I'm willing to hold on to this revision until the use case beyond what can already be served by scf.execute_region arrives. We need to perhaps first think about extending affine passes properly in the presence of scf.execute_region ops and reevaluate this.

bondhugula planned changes to this revision.Oct 26 2021, 3:14 AM