Page MenuHomePhabricator

[MLIR] Add RegionKindInterface
Needs ReviewPublic

Authored by stephenneuendorffer on May 20 2020, 10:31 PM.

Details

Summary

Some dialects have semantics which is not well represented by common
SSA structures with dominance constraints. This patch allows
operations to declare the 'kind' of their contained regions.
Currently, two kinds are allowed: "SSACFG" and "Graph". The only
difference between them at the moment is that SSACFG regions are
required to have dominance, while Graph regions are not required to
have dominance. The intention is that this Interface would be
generated by ODS for existing operations, although this has not yet
been implemented. Presumably, if someone were interested in code
generation, we might also have a "CFG" dialect, which defines control
flow, but does not require SSA.

The new behavior is mostly identical to the previous behavior, since
registered operations without a RegionKindInterface are assumed to
contain SSACFG regions. However, the behavior has changed for
unregistered operations. Previously, these were checked for
dominance, however the new behavior allows dominance violations, in
order to allow the processing of unregistered dialects with Graph
regions. One implication of this is that regions in unregistered
operations with more than one op are no longer CSE'd (since it
requires dominance info).

I've also reorganized the LangRef documentation to remove assertions
about "sequential execution", "SSA Values", and "Dominance". Instead,
the core IR is simply "ordered" (i.e. totally ordered) and consists of
"Values". I've also clarified some things about how control flow
passes between blocks in an SSACFG region. Control Flow must enter a
region at the entry block and follow terminator operation successors
or be returned to the containing op. Graph regions do not define a
notion of control flow.

see discussion here:
https://llvm.discourse.group/t/rfc-allowing-dialects-to-relax-the-ssa-dominance-condition/833/53

Diff Detail

Event Timeline

There are a very large number of changes, so older changes are hidden. Show Older Changes
rriddle added inline comments.Tue, Jun 16, 12:40 AM
mlir/include/mlir/IR/RegionKindInterface.h
14 ↗(On Diff #270965)

Please fix the header guard style.

17 ↗(On Diff #270965)

Can you trim these?

26 ↗(On Diff #270965)

Please add documentation here on what these are and what they mean.

29 ↗(On Diff #270965)

Graph? SSACFG contains acronyms so I see why they are capitalized.

mlir/include/mlir/IR/RegionKindInterface.td
27 ↗(On Diff #270965)

nit: Can you reflow this and move it to the next line? The doc formatter can handle this automatically.

27 ↗(On Diff #270965)

Missing period at the end of the summary.

30 ↗(On Diff #270965)

nit: Remove double space.

39 ↗(On Diff #270965)

nit: Please use multi line string for this description.

47 ↗(On Diff #270965)

Leftover debugging?

55 ↗(On Diff #270965)

Please remove.

60 ↗(On Diff #270965)

Please fix the header guard style.

mlir/include/mlir/Interfaces/CMakeLists.txt
7 ↗(On Diff #270965)

Seems unrelated.

mlir/lib/IR/RegionKindInterface.cpp
18 ↗(On Diff #270965)

I don't think this is necessary.

mlir/lib/IR/Verifier.cpp
31

I don't think this is necessary.

This revision now requires changes to proceed.Tue, Jun 16, 12:40 AM
rriddle added inline comments.Tue, Jun 16, 12:42 AM
mlir/test/lib/Dialect/Test/TestDialect.cpp
227

Please remove the mlir:: from each of these.

stephenneuendorffer marked 49 inline comments as done.Tue, Jun 16, 7:10 PM

Update after review changes

mlir/docs/Interfaces.md
232 ↗(On Diff #270965)

Certainly! I figure writing a few by hand would lead us to that.

mlir/docs/LangRef.md
33 ↗(On Diff #270250)

Yes, exactly. The point of this is that the order of instructions in the IR does not need to correspond to the execution order of instructions in a sequential machine. Even in a graph, the IR still maintains an order, which enables deterministic round-tripping, for instance.

41 ↗(On Diff #270250)

What is an "Operation Definition" other than traits, interfaces, and verification constraints? operand and result names? successor names? I've reworded this section.

539 ↗(On Diff #270250)

@rriddle I don't understand whether you are disagreeing with Mehdi or me.

I think that if ops are unregistered we should be pretty liberal about what should be allowed. Structually, the only requirement I've imposed on regions of unregistered ops is that all value identifiers must be defined. In particular, I have not required that regions of unregistered operations must be either 1) a single block or 2) a valid SSACFG region with dominance. (I think this was suggested in the discussion)

546 ↗(On Diff #270250)

I was also thinking about something co-routine like where Blocks could yield to another block chosen by the containing operation.

468 ↗(On Diff #270965)

How so? The point of this is to show the structures that are important and that are the requirements on those structures, so I'm trying to define a region structurally separate from the CFG semantics. What would you add to this, other than "it's a CFG?"

545 ↗(On Diff #270965)

we don't call them "potential successors' anywhere else. I think the meaning is clear from the following sentence

stephenneuendorffer marked 3 inline comments as done.
rriddle marked an inline comment as done.Tue, Jun 16, 7:27 PM
rriddle added inline comments.
mlir/docs/LangRef.md
546 ↗(On Diff #270250)

You could also implement coroutines with an explicit switch in the entry block that contains each of the potential resume points.

stephenneuendorffer retitled this revision from [MLIR][InProgress] Add RegionKindInterface to [MLIR] Add RegionKindInterface.
stephenneuendorffer edited the summary of this revision. (Show Details)
rriddle added inline comments.Wed, Jun 17, 5:29 PM
mlir/lib/IR/Verifier.cpp
238

I don't think this is correct. You are missing checks for how dominance interacts across different regions. Even if there is no intra-region dominance, there are still constraints how values may be used across regions. For example, the operand values have to be defined in this region or an ancestor region. This is also missing checks on how verification happens when the value is defined in a region with dominance and used in a region without. The dominance relationship between the ancestor operation in the region with dominance must still be checked, i.e.,

// region with dominance
func @foo() {
  // region without dominance
  bar.non_dominance_region {
    // Verifier failure: This is a non-dominating use.
    bar.use %foo_value
    bar.return
  }
  %foo_value = constant true
mehdi_amini added inline comments.Wed, Jun 17, 8:14 PM
mlir/docs/LangRef.md
33 ↗(On Diff #270250)

Yes, exactly

I was pointing out something that seems contradictory, so I'm puzzled by your answer here?

Even in a graph, the IR still maintains an order, which enables deterministic round-tripping, for instance.

Yes but there is still no total order since you have cycles.

The point of this is that the order of instructions in the IR does not need to correspond to the execution order of instructions in a sequential machine.

OK but the way it is written is still confusing to me. You're also using "Region" as if it is a well defined term while it is the first use in the doc.
I'll look for an alternative writing after we settle on the description of multi-blocks region below.

Operations are also organized into [Blocks](#blocks) and are totally ordered
within their containing [Region](#regions)

539 ↗(On Diff #270250)

In particular, I have not required that regions of unregistered operations must be either 1) a single block or 2) a valid SSACFG region with dominance. (I think this was suggested in the discussion)

This complicates the design, and I don't believe your specification is complete right now with respect to non-CFG multi-blocks region.

The paragraph that follows is a good example, when it says that when control flow enters a region, it always begins in the first block of the region, called the *entry* block we already have a notion of control for the blocks. Then Control flow can only pass to one of the specified successor blocks as in a branch operation, or back to the containing operation as in a return operation. Terminator operations without successors can only pass.
In your example above:

"unregistered_op" () ({
^bb1:
  %2 = %1
  %1 = %2
^bb2:
  %3 = %2
}

The second block is clearly unreachable and can just be deleted.

It also means that:

"unregistered_op" () ({
^bb1:
  %2 = %3
  %1 = %2
^bb2:
  %3 = ...
}

Would be invalid IR.

stephenneuendorffer marked 3 inline comments as done.Thu, Jun 18, 8:36 AM
stephenneuendorffer added inline comments.
mlir/docs/LangRef.md
33 ↗(On Diff #270250)

OK.. I see there being several relationships here. One is the total order of operations in a block and blocks in a region. There is a separate relationship which is the data dependence relationship, which I don't think is actually a partial order on operations in Graph regions or SSACFG regions. Certainly not if terminator edges are included.

539 ↗(On Diff #270250)

My intention is that the "Control Flow" section here only applies to SSACFG regions. Maybe there needs to be a separate section on Graph regions. Your example illustrates why 'reachability' of basic blocks is probably not a great concept in graph regions.

mlir/lib/IR/Verifier.cpp
238

Right... I think some more test cases are in order here.

mehdi_amini added inline comments.Fri, Jun 19, 1:20 AM
mlir/docs/LangRef.md
468 ↗(On Diff #270965)

I see the structure as trivial: only the semantics is complex (i.e. reachability, control-flow, etc.).

So I'm not sure what's the point of long description of what could be shown with a few lines of grammar specification (or even 10 lines of C++).

bollu added a subscriber: bollu.Wed, Jun 24, 11:24 AM
stephenneuendorffer edited the summary of this revision. (Show Details)
stephenneuendorffer marked 8 inline comments as done.
mlir/docs/LangRef.md
468 ↗(On Diff #270965)

I think the structure is actually harder to grok than you might think for outsiders. Also, I think the syntactic issues of how it is represented as a grammar doesn't necessarily reflect what you need to know to use it from c++. (In fact, I've been considering if these need to be separated further, having a separate page for an abstract description and leave most of what is currently in LangRef.md as a description of the concrete grammar syntax.)

I think one of the things that is missing right now is that the semantics is quite flexible, but not without bounds. Also the syntax is quite flexible, but also not without bounds. I see this as a step towards being explicit about those bounds, at least on the semantic side.

mlir/lib/IR/Dominance.cpp
37–56 ↗(On Diff #274277)

FYI: This is a significant departure from the previous patch, which attempted to put the 'smarts' in the Verifier. It seems like in order to deal properly with Mehdi's testcase where a graph region encloses an SSACFG region, without duplicating alot of the traversal, that all kinds of regions need a notion of dominance. It seems to me that the right notion of dominance for graph regions is that all values are in-scope in all blocks. so: all blocks dominate all other blocks and all values within a block dominate all other values. This also highlights the fact that CSE needs some different machinery to properly handle Graph Regions, since it currently traverses using DominatorTrees.

mlir/lib/IR/Verifier.cpp
238

I've added some tests for this. The more problematic case is the reverse:

// region without dominance
bar.non_dominance_region() {
  // region with dominance
  bar.dominance_region {
    // Should this be illegal?
    bar.use %foo_value
    bar.return
  }
  %foo_value = constant true

It's not clear to me that there is a useful interpretation for this. At the moment I've made this illegal, but it's something that could conceivably be relaxed.

stephenneuendorffer updated this revision to Diff 274283.
stephenneuendorffer marked an inline comment as done.Mon, Jun 29, 4:38 PM
stephenneuendorffer added inline comments.
mlir/lib/IR/Verifier.cpp
238

The code above is actually legal, since "%foo_value" is legal to use as an argument to bar.dominance_region

Harbormaster failed remote builds in B62245: Diff 274282!
rriddle added inline comments.Mon, Jun 29, 4:52 PM
mlir/lib/IR/Verifier.cpp
238

The fairly simple rule that I've always applied to dominance is that the parent operations act as dominance synch points. The parent operation of the definition has to dominate the ancestor of the user that is within that region.

stephenneuendorffer marked 2 inline comments as done.Mon, Jun 29, 5:39 PM
stephenneuendorffer added inline comments.
mlir/lib/IR/Verifier.cpp
238

Agreed. And I think this lifts immediately to Graph Regions: In a Graph Region, everything dominates everything else.

mehdi_amini added inline comments.Mon, Jun 29, 9:06 PM
mlir/docs/LangRef.md
440 ↗(On Diff #274283)

This defines block in a way that does not seem to me to be compatible with graph regions.

495 ↗(On Diff #274283)

You define a notion of successor here, does it generalize to graph region?

539 ↗(On Diff #274283)

in which case some other operation will execute

That does not read well to me right now.

595 ↗(On Diff #274283)

It is interesting that this ends up nested under the SSACFG region case

626 ↗(On Diff #274283)

typo: relatinoships

628 ↗(On Diff #274283)

I think it should be either with values representing streams of data. or where values represent streams of data.

656 ↗(On Diff #274283)

The story about block terminators, successors, etc. isn't clear to me with respect to graph region. Even the relationship between blocks isn't entirely clear here actually.

mlir/include/mlir/IR/Dominance.h
63 ↗(On Diff #274283)

The concept of entry block is no longer a guarantee

stephenneuendorffer marked 11 inline comments as done.
stephenneuendorffer marked an inline comment as done.Mon, Jun 29, 10:56 PM
stephenneuendorffer added inline comments.
mlir/docs/LangRef.md
495 ↗(On Diff #274283)

Currently we keep around blocks in a graph region after transforming from an SSACFG region. The blocks aren't semantically meaningful (and can be eliminated in our current model without changing the semantics), but are at least useful bookkeeping.

539 ↗(On Diff #274283)

I agree it's a little vague, but it's trying to bridge into the next section which talks about control flow and terminators. Do you have a suggestion?

595 ↗(On Diff #274283)

It's all about control flow with multiple regions. Graph regions don't have a notion of control flow.

656 ↗(On Diff #274283)

Correct, I haven't defined any semantic requirements about blocks in graph regions.

mlir/include/mlir/IR/Dominance.h
63 ↗(On Diff #274283)

There are some aspects of the interface of this class which are awkward. If regions define their own concept of dominance, then there probably needs to be an abstract interface exposing different implementations of the 'local' concept of dominance. Currently, the two implementations are somewhat awkwardly tangled together along with the hierarchy traversing code. I see that there should be some additional refactoring of this code which might push the local concept of dominance into RegionKindInterface, but I'd like to get the concepts agreed to first.

Another example is that getRootNode() has to be guarded by hasDominanceInfo(). getRootNode() isn't really appropriate for regions without an entry block.

mehdi_amini added inline comments.Fri, Jul 3, 9:18 PM
mlir/docs/LangRef.md
45 ↗(On Diff #274339)

each transformation on operations to be designed with the semantics in mind

I don't understand what this means here.

211 ↗(On Diff #274339)
Argument identifiers in mapping functions

Not sure what "mapping function" refers to here? It is in the existing doc, but this is the only place it is used. I wonder if it isn't something remaining from before functions were regular Operations?

Is this about block argument? Can you just write "Block argument" instead?

Edit: I wonder if this is about affine maps?
(we can leave this out of this revision, but we should probably revisit this)

213 ↗(On Diff #274339)

Particular operations may further limit the valid scope of identifiers

I think this is not specific enough: the scope is only limited by the region kind and I'd rather spell it out this way.
With how you wrote it, one could think that we could have an operation restricting the scope of the value it produces (for example to be block-local).

339 ↗(On Diff #274339)

It isn't clear to me why a module wouldn't be a graph region?
Do we put any meaning to the order of operations in a module?

428 ↗(On Diff #274339)

Is "commonly" intended to express that this isn't always the case?
What about spelling out the exact case: "In a SSACFG region, blocks represent ...`

435 ↗(On Diff #274339)

i.e. Branches

Shouldn't this be e.g. instead?

478 ↗(On Diff #274339)

I feel the section here isn't entirely complete with respect to non CFG region with multiple blocks: are these still having terminator? What's the contract on block arguments?

An example would be worthwhile as well.

656 ↗(On Diff #274283)

To me this is like my comment above about terminator, successor, etc.
Even if it isn't requires, let's make it explicit.

33 ↗(On Diff #270250)

In the definition of Blocks, we just wrote A *Block* is an ordered list of operations, can we just do the same here and avoid "totally ordered"?
Maybe like the following:

Operations are also organized into [Blocks](#blocks) as ordered list  (the order may or may not have a semantics important, see [Region Kinds](link)).  Blocks are themselves organized as an ordered list within their containing [Region](#regions).
stephenneuendorffer marked 18 inline comments as done.Mon, Jul 6, 9:58 AM
stephenneuendorffer added inline comments.
mlir/docs/LangRef.md
45 ↗(On Diff #274339)

I've rewritten this section.

211 ↗(On Diff #274339)

Yes, I believe this is talking about affine maps.

213 ↗(On Diff #274339)

You're right, that could be misleading. I've rewritten it.

339 ↗(On Diff #274339)

I suppose it could be, although it's somewhat of a degenerate kind of region anyway, since modules don't define any values and don't contain blocks. Maybe this should be another kind of region?

478 ↗(On Diff #274339)

I added an additional note about the Graph Region rationale, but I'm really not sure how to provide an example here of something that I don't have a good example of :). Let's chat offline about this.

stephenneuendorffer marked 4 inline comments as done.
mehdi_amini added inline comments.Mon, Jul 6, 10:01 PM
mlir/docs/LangRef.md
677 ↗(On Diff #275753)

The indentation does not seem right in this example?
The reuse of %2 does not help readability either, just like %4 before %2

stephenneuendorffer marked 2 inline comments as done.