This is an archive of the discontinued LLVM Phabricator instance.

[MLIR] Introduce op trait PolyhedralScope
ClosedPublic

Authored by bondhugula on Apr 25 2020, 7:29 AM.

Details

Summary

Introduce op trait PolyhedralScope for ops to define a new scope for
polyhedral optimization / affine dialect purposes, thus generalizing
such scopes beyond FuncOp. Ops to which this trait is attached will
define a new scope for the consideration of SSA values as valid symbols
for affine ops and their analysis/transforms. Update methods that check
for dim/symbol validity to work based on this trait.

Note: This op trait could be moved to the affine dialect so that only
those who depend on it are able to use it. However, this would mean that
FuncOp will have to be treated specially in the symbol checks, i.e.,
(FuncOp || op with trait PolyhedralScope) would replace (op with trait
PolyhedralScope) for all purposes. Keeping this trait in the core IR
allows FuncOp to be marked PolyhedralScope.

Diff Detail

Event Timeline

bondhugula created this revision.Apr 25 2020, 7:29 AM

Add trait documentation

Fix comment

Harbormaster completed remote builds in B54679: Diff 260096.
Harbormaster completed remote builds in B54681: Diff 260098.
bondhugula edited the summary of this revision. (Show Details)Apr 26 2020, 3:13 AM
bondhugula removed a reviewer: antiagainst.
ftynse accepted this revision.Apr 27 2020, 6:09 AM

Looks fine to me with comments adressed

mlir/docs/Dialects/Affine.md
64

Nit: can we make this an actual markdown list?

mlir/include/mlir/IR/OpDefinition.h
1060

Shouldn't we rather change that op actually has regions? An op may not have ZeroRegion trait and still have no associated regions.

mlir/lib/Dialect/Affine/IR/AffineOps.cpp
157–158

Did you mean parentOp->hasTrait<PolyhedralScope>() instead of isolated from above?

mlir/test/Dialect/Affine/ops.mlir
122

It is a bit unusual to have a dialect test depend on ops from the test dialect, but looks okay given the nature of the test.

bondhugula marked 8 inline comments as done.

Address review comments

Thanks for the review!

mlir/docs/Dialects/Affine.md
64

Sure - that would be better.

mlir/include/mlir/IR/OpDefinition.h
1060

One could imagine an op that has some instances of it with zero regions and some instances with a region. One would in such cases let the op have the PolyhedralScope trait: we don't have such an example in the code base or a scenario I can think of right away but no regions is actually no harm. So, I thought we'll only disallow if it's statically a zero region op. That said, this check should have actually been a static_assert:

static_assert(ConcreteType::template hasTrait<ZeroRegion>(),
                  "expected operation to have one or more regions");
mlir/lib/Dialect/Affine/IR/AffineOps.cpp
157–158

Thanks - forgot to update this.

mlir/test/Dialect/Affine/ops.mlir
122

Hmm... there is one in parser.mlir also that depends on test.isolated_region; can't think of a way to test these without the test dialect. But once there is affine.execute_region, that could replace this.

rriddle added inline comments.Apr 27 2020, 10:24 AM
mlir/include/mlir/IR/OpDefinition.h
1056

Is this trait applicable to anything outside of the affine dialect? If not I don't really see why it should be here and not in the affine dialect instead.

mlir/lib/Dialect/Affine/IR/AffineOps.cpp
104

value.getParentRegion() == region?

249

Can we use m_Constant instead?

bondhugula added inline comments.Apr 27 2020, 1:08 PM
mlir/include/mlir/IR/OpDefinition.h
1056

I was in two minds on this. Could you see the second para of the commit summary? We'd like to mark FuncOp with PolyhedralScope.

rriddle added inline comments.Apr 27 2020, 1:32 PM
mlir/include/mlir/IR/OpDefinition.h
1056

Ah, sorry I missed that. A lot of times I skip the commit message and jump right in.

Yes, marking FuncOp makes things a bit difficult. This has come up before w.r.t interfaces as well. I'd like to avoid adding special dialect traits here if they aren't general(especially because I'd like to trim this file a bit). I would expect that you could also use FunctionLike(instead of FuncOp) as an indicator of a polyhedral scope. I'm leaning more towards moving the trait to Affine unless you can foresee uses outside of that dialect.

bondhugula added inline comments.Apr 27 2020, 1:59 PM
mlir/include/mlir/IR/OpDefinition.h
1056

There could be things that don't depend on the Affine dialect but would want to use the 'PolyhedralScope' on their FuncLike or other region ops. That's because their lowering could potentially go through the affine dialect much later which their region holding op lasts through - is this meaningful?

But when I initially started with keeping this in the Affine dialect, all the clean and compact code that we have now becomes weird. To be precise:

op->hasTrait<OpTrait::PolyhedralScope> becomes
op->hasTrait<OpTrait::PolyhedralScope> || isa<FuncOp>(op)

We can't right away use FuncLikeOp (although we want to go in that direction) because not all func op's may want to opt into this; you'd want to whitelist the desired ones by attaching the PolyhedralScope trait - they may or may not depend on the affine dialect. Not having PolyhedralScope attached to FuncOp would mean a simple getParentWithTrait<PolyhedralScope> becomes an entire while loop that tries to find an op that either has the PolyhedralScope trait or is a FuncOp.

rriddle added inline comments.Apr 27 2020, 2:06 PM
mlir/include/mlir/IR/OpDefinition.h
1056

I agree that it is easier to manipulate when there is one interface for detection. Can we have an Affine/Traits.h that defines this and just include it in Function.h? It isn't ideal, but we already have to do this for interfaces anyways? I'm just trying to avoid the future where many non-core traits get pulled into IR/ because of the current weirdness with FuncOp.

bondhugula marked 6 inline comments as done.

Create Affine/Traits.h and move polyhedral scope trait there.

bondhugula marked 2 inline comments as done.

Resolve remaining review comments.

rriddle accepted this revision.Apr 27 2020, 3:16 PM

LGTM after resolving the other comments.

mlir/include/mlir/IR/OpBase.td
1633

Can you move this to AffineOpsBase.td or somewhere similar?

This revision is now accepted and ready to land.Apr 27 2020, 3:16 PM
andydavis1 accepted this revision.Apr 27 2020, 5:27 PM

Thanks Uday...

mlir/lib/Dialect/Affine/IR/AffineOps.cpp
139

Same comment as for isSymbol. Make the conditions a list, so its easier for people to see the set of conditions.

224

There are a lot of "ors" in this description, and this is the place most people will be looking for what the definition of what a symbol is. Could you make this a bullet list:
Value can be used as a symbol iff it meets one of the following conditions:
*) It is a constant.
*) It is defined at the top level of 'region'...
*) ...

bondhugula marked 5 inline comments as done.

Resolve remaining review comments.

mlir/include/mlir/IR/OpDefinition.h
1056

Sure - that sounds good. Done.

mlir/lib/Dialect/Affine/IR/AffineOps.cpp
104

Thanks.

224

Sure.

bondhugula edited the summary of this revision. (Show Details)Apr 27 2020, 8:39 PM
bondhugula edited the summary of this revision. (Show Details)

Rebase.

This revision was automatically updated to reflect the committed changes.

Sorry, I reverted this change in https://github.com/llvm/llvm-project/commit/ef06016d73390d5b380018cc0d16003b4ed4a35a. @ftynse understands the problem and will respond here a bit later.

Sorry, I reverted this change in https://github.com/llvm/llvm-project/commit/ef06016d73390d5b380018cc0d16003b4ed4a35a. @ftynse understands the problem and will respond here a bit later.

I had tested this locally with Clang and GCC. The harbormaster tests were already failing due to unrelated clang changes. I'm curious to know what the issue was. Thanks.

Sorry, I reverted this change in https://github.com/llvm/llvm-project/commit/ef06016d73390d5b380018cc0d16003b4ed4a35a. @ftynse understands the problem and will respond here a bit later.

This broke IREE and also was problematic with Bazel builds. There are two issues, and I posit that one of them caused another.

  1. IREE pass pipeline triggered llvm_unreachable("op doesn't have an enclosing polyhedral scope"); (AffineOps.cpp:119). We investigated this and found that this pipeline _first_ transforms std.func into gpu.func, and then runs transformations that depend transitively on Affine within gpu.func. The latter not having the PolyhedralScope trait, and with no other function like op around, the loop in getAffineScope completed and the assertion was triggered. An obvious solution to this is to give gpu.func op the trait. However, I think this failure rather indicates a deeper problem with the affine scope modeling: nothing in the semantics of the Affine Ops requires (and verifies) that they _only_ appear within some op with a PolyhedralScope. So the control flow path with "llvm_unrechable" actually _is_ reachable by valid IR. We can either add such requirement, or consider top-level ModuleOp (which should be there by default) to also be a polyhedral scope. This needs further discussion.
  2. The introduction of Affine/Traits.h created a circular dependency between "IR" and "AffineDialect" libraries (AffineDialect depends on IR because it uses core IR constructs, and IR depends on AffineDialect because Function has a trait defined in Affine). It is actually possible to circumvent the issue in Bazel build files we use, e.g., for TensorFlow, but it is considered to be a hack. Since this specific issue was discussed in the review, let's iterate one more time on it. It party stems from the original lack of separation between core IR and Affine+Standard dialects. Factoring Function out of the "IR" library can be one way to break the cycle. However, the issue of dependency on the affine dialect only for the purpose of having a trait remains. Generally, any dialect with region-carrying ops that wishes to contain ops from the Affine dialect has to depend on it, e.g. GPU. This sounds wrong layering-wise. Since we don't want dialect-specific traits to live in IR, another possible separation is to have dialect-specific traits live in a library (and filepath) separate from both IR and Dialect, with both IR and Dialect depending on this library.

Sorry, I reverted this change in https://github.com/llvm/llvm-project/commit/ef06016d73390d5b380018cc0d16003b4ed4a35a. @ftynse understands the problem and will respond here a bit later.

This broke IREE and also was problematic with Bazel builds. There are two issues, and I posit that one of them caused another.

  1. IREE pass pipeline triggered llvm_unreachable("op doesn't have an enclosing polyhedral scope"); (AffineOps.cpp:119). We investigated this and found that this pipeline _first_ transforms std.func into gpu.func, and then runs transformations that depend transitively on Affine within gpu.func. The latter not having the PolyhedralScope trait, and with no other function like op around, the loop in getAffineScope completed and the assertion was triggered. An obvious solution to this is to give gpu.func op the trait. However, I think this failure rather indicates a deeper problem with the affine scope modeling: nothing in the semantics of the Affine Ops requires (and verifies) that they _only_ appear within some op with a PolyhedralScope. So the control flow path with "llvm_unrechable" actually _is_ reachable by valid IR. We can either add such requirement, or consider top-level ModuleOp (which should be there by default) to also be a polyhedral scope. This needs further discussion.

As you hint, marking the gpu.func with PolyhedralScope isn't really a solution, although we may want to do that in the future. Marking the ModuleOp PolyhedralScope appears to be the right fix to me - the root op is always a valid polyhedral scope.

  1. The introduction of Affine/Traits.h created a circular dependency between "IR" and "AffineDialect" libraries (AffineDialect depends on IR because it uses core IR constructs, and IR depends on AffineDialect because Function has a trait defined in Affine). It is actually possible to circumvent the issue in Bazel build files we use, e.g., for TensorFlow, but it is considered to be a hack. Since this specific issue was discussed in the review, let's iterate one more time on it. It party stems from the original lack of separation between core IR and Affine+Standard dialects. Factoring Function out of the "IR" library can be one way to break the cycle. However, the issue of dependency on the affine dialect only for the purpose of having a trait remains. Generally, any dialect with region-carrying ops that wishes to contain ops from the Affine dialect has to depend on it, e.g. GPU. This sounds wrong layering-wise. Since we don't want dialect-specific traits to live in IR, another possible separation is to have dialect-specific traits live in a library (and filepath) separate from both IR and Dialect, with both IR and Dialect depending on this library.

This was due to oversight. It was River's suggestion to include Affine/Traits.h from Function, but he perhaps intended for it to be in IR/ - I earlier had it in OpDefinition.h where it didn't create a cyclic dependence.

Sorry, I reverted this change in https://github.com/llvm/llvm-project/commit/ef06016d73390d5b380018cc0d16003b4ed4a35a. @ftynse understands the problem and will respond here a bit later.

This broke IREE and also was problematic with Bazel builds. There are two issues, and I posit that one of them caused another.

  1. IREE pass pipeline triggered llvm_unreachable("op doesn't have an enclosing polyhedral scope"); (AffineOps.cpp:119). We investigated this and found that this pipeline _first_ transforms std.func into gpu.func, and then runs transformations that depend transitively on Affine within gpu.func. The latter not having the PolyhedralScope trait, and with no other function like op around, the loop in getAffineScope completed and the assertion was triggered. An obvious solution to this is to give gpu.func op the trait. However, I think this failure rather indicates a deeper problem with the affine scope modeling: nothing in the semantics of the Affine Ops requires (and verifies) that they _only_ appear within some op with a PolyhedralScope. So the control flow path with "llvm_unrechable" actually _is_ reachable by valid IR. We can either add such requirement, or consider top-level ModuleOp (which should be there by default) to also be a polyhedral scope. This needs further discussion.

As you hint, marking the gpu.func with PolyhedralScope isn't really a solution, although we may want to do that in the future. Marking the ModuleOp PolyhedralScope appears to be the right fix to me - the root op is always a valid polyhedral scope.

I could not convince myself it would work in that IREE case, but it probably would. isValidDim was not checking for GPUFuncOp as a symbol definition previously, so it shouldn't affect the behavior.

  1. The introduction of Affine/Traits.h created a circular dependency between "IR" and "AffineDialect" libraries (AffineDialect depends on IR because it uses core IR constructs, and IR depends on AffineDialect because Function has a trait defined in Affine). It is actually possible to circumvent the issue in Bazel build files we use, e.g., for TensorFlow, but it is considered to be a hack. Since this specific issue was discussed in the review, let's iterate one more time on it. It party stems from the original lack of separation between core IR and Affine+Standard dialects. Factoring Function out of the "IR" library can be one way to break the cycle. However, the issue of dependency on the affine dialect only for the purpose of having a trait remains. Generally, any dialect with region-carrying ops that wishes to contain ops from the Affine dialect has to depend on it, e.g. GPU. This sounds wrong layering-wise. Since we don't want dialect-specific traits to live in IR, another possible separation is to have dialect-specific traits live in a library (and filepath) separate from both IR and Dialect, with both IR and Dialect depending on this library.

This was due to oversight. It was River's suggestion to include Affine/Traits.h from Function, but he perhaps intended for it to be in IR/ - I earlier had it in OpDefinition.h where it didn't create a cyclic dependence.

He mentioned there is already a similar circular dependency between Interfaces and IR, which is indeed the case, so it looks like he meant what you did...

@ftynse, @rriddle As I fix this, should the commit title / summary be "Revert "Revert ...."" or should it just be the original title and summary "[MLIR] Introduce ..."? Thanks.

@ftynse, @rriddle As I fix this, should the commit title / summary be "Revert "Revert ...."" or should it just be the original title and summary "[MLIR] Introduce ..."? Thanks.

I think any mention of the previous commit series is fine.