This is an archive of the discontinued LLVM Phabricator instance.

[MLIR] Add RegionKindInterface
ClosedPublic

Authored by stephenneuendorffer on May 20 2020, 10:31 PM.

Details

Summary

Some dialects have semantics which is not well represented by common
SSA structures with dominance constraints. This patch allows
operations to declare the 'kind' of their contained regions.
Currently, two kinds are allowed: "SSACFG" and "Graph". The only
difference between them at the moment is that SSACFG regions are
required to have dominance, while Graph regions are not required to
have dominance. The intention is that this Interface would be
generated by ODS for existing operations, although this has not yet
been implemented. Presumably, if someone were interested in code
generation, we might also have a "CFG" dialect, which defines control
flow, but does not require SSA.

The new behavior is mostly identical to the previous behavior, since
registered operations without a RegionKindInterface are assumed to
contain SSACFG regions. However, the behavior has changed for
unregistered operations. Previously, these were checked for
dominance, however the new behavior allows dominance violations, in
order to allow the processing of unregistered dialects with Graph
regions. One implication of this is that regions in unregistered
operations with more than one op are no longer CSE'd (since it
requires dominance info).

I've also reorganized the LangRef documentation to remove assertions
about "sequential execution", "SSA Values", and "Dominance". Instead,
the core IR is simply "ordered" (i.e. totally ordered) and consists of
"Values". I've also clarified some things about how control flow
passes between blocks in an SSACFG region. Control Flow must enter a
region at the entry block and follow terminator operation successors
or be returned to the containing op. Graph regions do not define a
notion of control flow.

see discussion here:
https://llvm.discourse.group/t/rfc-allowing-dialects-to-relax-the-ssa-dominance-condition/833/53

Diff Detail

Event Timeline

There are a very large number of changes, so older changes are hidden. Show Older Changes
rriddle added inline comments.Jun 16 2020, 12:40 AM
mlir/docs/LangRef.md
65

nit: I find this definition confusing given that arguments to blocks are also values.

76

I feel the same way as Mehdi. I would rephrase to something like:

Operations can be arbitrarily extended in MLIR, and may represent many different concepts ...
...

I would even consider removing the bit about verification. Feels a bit out of place here.

218–233

nit: are only in scope

220

This isn't true. This goes into symbol table semantics, which scopes the visibility with its own set of rules. https://mlir.llvm.org/docs/SymbolsAndSymbolTables/

434

Can you reword this? It gives off the vibe that the terminator is actually generating the value, but terminator operations don't generally have results. Some terminator operations may produce the PHI value internally, but those are not very common. I would rephrase to "provided" or something similar.

551

nit: Can you mention IsolatedFromAbove here and link to the doc on it?

629

However, the particular semantics of operations are
unspecified and completely specify the behavior of an operation and
inside the regions of an operation.

Can you rephrase this? I'm having trouble parsing it.

632

nit: Link to the documentation for this.

634

I don't agree on the principal here. We only have to be conservative if it "could" be something that has valid semantics when registered. For example, we have to be conservative about unregistered operations that "could" be symbol tables, but that doesn't mean we error out if we see any unregistered region holding operation. Being unregistered IMO is just having the capability of being anything legal. If something isn't legal when registered, why should we guard against it when unregistered?

639

nit: the potential successor blocks

641

Correct. There is no modeling ATM for indirect branches where you have no idea what the potential destination could be.

647

nit: are treated as unreachable and may be

671

nit: Regions -> regions

mlir/include/mlir/IR/RegionKindInterface.h
2

This line looks like it is one character to short.

15

Please fix the header guard style.

18

Can you trim these?

27

Please add documentation here on what these are and what they mean.

30

Graph? SSACFG contains acronyms so I see why they are capitalized.

mlir/include/mlir/IR/RegionKindInterface.td
28

nit: Can you reflow this and move it to the next line? The doc formatter can handle this automatically.

28

Missing period at the end of the summary.

31

nit: Remove double space.

40

nit: Please use multi line string for this description.

48

Leftover debugging?

56

Please remove.

61

Please fix the header guard style.

mlir/include/mlir/Interfaces/CMakeLists.txt
7 ↗(On Diff #270965)

Seems unrelated.

mlir/lib/IR/RegionKindInterface.cpp
19

I don't think this is necessary.

mlir/lib/IR/Verifier.cpp
31

I don't think this is necessary.

This revision now requires changes to proceed.Jun 16 2020, 12:40 AM
rriddle added inline comments.Jun 16 2020, 12:42 AM
mlir/test/lib/Dialect/Test/TestDialect.cpp
263

Please remove the mlir:: from each of these.

stephenneuendorffer marked 49 inline comments as done.Jun 16 2020, 7:10 PM

Update after review changes

mlir/docs/Interfaces.md
232

Certainly! I figure writing a few by hand would lead us to that.

mlir/docs/LangRef.md
68

Yes, exactly. The point of this is that the order of instructions in the IR does not need to correspond to the execution order of instructions in a sequential machine. Even in a graph, the IR still maintains an order, which enables deterministic round-tripping, for instance.

76

What is an "Operation Definition" other than traits, interfaces, and verification constraints? operand and result names? successor names? I've reworded this section.

501–508

How so? The point of this is to show the structures that are important and that are the requirements on those structures, so I'm trying to define a region structurally separate from the CFG semantics. What would you add to this, other than "it's a CFG?"

634

@rriddle I don't understand whether you are disagreeing with Mehdi or me.

I think that if ops are unregistered we should be pretty liberal about what should be allowed. Structually, the only requirement I've imposed on regions of unregistered ops is that all value identifiers must be defined. In particular, I have not required that regions of unregistered operations must be either 1) a single block or 2) a valid SSACFG region with dominance. (I think this was suggested in the discussion)

639

we don't call them "potential successors' anywhere else. I think the meaning is clear from the following sentence

641

I was also thinking about something co-routine like where Blocks could yield to another block chosen by the containing operation.

stephenneuendorffer marked 3 inline comments as done.
rriddle marked an inline comment as done.Jun 16 2020, 7:27 PM
rriddle added inline comments.
mlir/docs/LangRef.md
641

You could also implement coroutines with an explicit switch in the entry block that contains each of the potential resume points.

stephenneuendorffer retitled this revision from [MLIR][InProgress] Add RegionKindInterface to [MLIR] Add RegionKindInterface.
stephenneuendorffer edited the summary of this revision. (Show Details)
rriddle added inline comments.Jun 17 2020, 5:29 PM
mlir/lib/IR/Verifier.cpp
267

I don't think this is correct. You are missing checks for how dominance interacts across different regions. Even if there is no intra-region dominance, there are still constraints how values may be used across regions. For example, the operand values have to be defined in this region or an ancestor region. This is also missing checks on how verification happens when the value is defined in a region with dominance and used in a region without. The dominance relationship between the ancestor operation in the region with dominance must still be checked, i.e.,

// region with dominance
func @foo() {
  // region without dominance
  bar.non_dominance_region {
    // Verifier failure: This is a non-dominating use.
    bar.use %foo_value
    bar.return
  }
  %foo_value = constant true
mehdi_amini added inline comments.Jun 17 2020, 8:14 PM
mlir/docs/LangRef.md
68

Yes, exactly

I was pointing out something that seems contradictory, so I'm puzzled by your answer here?

Even in a graph, the IR still maintains an order, which enables deterministic round-tripping, for instance.

Yes but there is still no total order since you have cycles.

The point of this is that the order of instructions in the IR does not need to correspond to the execution order of instructions in a sequential machine.

OK but the way it is written is still confusing to me. You're also using "Region" as if it is a well defined term while it is the first use in the doc.
I'll look for an alternative writing after we settle on the description of multi-blocks region below.

Operations are also organized into [Blocks](#blocks) and are totally ordered
within their containing [Region](#regions)

634

In particular, I have not required that regions of unregistered operations must be either 1) a single block or 2) a valid SSACFG region with dominance. (I think this was suggested in the discussion)

This complicates the design, and I don't believe your specification is complete right now with respect to non-CFG multi-blocks region.

The paragraph that follows is a good example, when it says that when control flow enters a region, it always begins in the first block of the region, called the *entry* block we already have a notion of control for the blocks. Then Control flow can only pass to one of the specified successor blocks as in a branch operation, or back to the containing operation as in a return operation. Terminator operations without successors can only pass.
In your example above:

"unregistered_op" () ({
^bb1:
  %2 = %1
  %1 = %2
^bb2:
  %3 = %2
}

The second block is clearly unreachable and can just be deleted.

It also means that:

"unregistered_op" () ({
^bb1:
  %2 = %3
  %1 = %2
^bb2:
  %3 = ...
}

Would be invalid IR.

stephenneuendorffer marked 3 inline comments as done.Jun 18 2020, 8:36 AM
stephenneuendorffer added inline comments.
mlir/docs/LangRef.md
68

OK.. I see there being several relationships here. One is the total order of operations in a block and blocks in a region. There is a separate relationship which is the data dependence relationship, which I don't think is actually a partial order on operations in Graph regions or SSACFG regions. Certainly not if terminator edges are included.

634

My intention is that the "Control Flow" section here only applies to SSACFG regions. Maybe there needs to be a separate section on Graph regions. Your example illustrates why 'reachability' of basic blocks is probably not a great concept in graph regions.

mlir/lib/IR/Verifier.cpp
267

Right... I think some more test cases are in order here.

mehdi_amini added inline comments.Jun 19 2020, 1:20 AM
mlir/docs/LangRef.md
501–508

I see the structure as trivial: only the semantics is complex (i.e. reachability, control-flow, etc.).

So I'm not sure what's the point of long description of what could be shown with a few lines of grammar specification (or even 10 lines of C++).

bollu added a subscriber: bollu.Jun 24 2020, 11:24 AM
stephenneuendorffer edited the summary of this revision. (Show Details)
stephenneuendorffer marked 8 inline comments as done.
mlir/docs/LangRef.md
501–508

I think the structure is actually harder to grok than you might think for outsiders. Also, I think the syntactic issues of how it is represented as a grammar doesn't necessarily reflect what you need to know to use it from c++. (In fact, I've been considering if these need to be separated further, having a separate page for an abstract description and leave most of what is currently in LangRef.md as a description of the concrete grammar syntax.)

I think one of the things that is missing right now is that the semantics is quite flexible, but not without bounds. Also the syntax is quite flexible, but also not without bounds. I see this as a step towards being explicit about those bounds, at least on the semantic side.

mlir/lib/IR/Dominance.cpp
45–67

FYI: This is a significant departure from the previous patch, which attempted to put the 'smarts' in the Verifier. It seems like in order to deal properly with Mehdi's testcase where a graph region encloses an SSACFG region, without duplicating alot of the traversal, that all kinds of regions need a notion of dominance. It seems to me that the right notion of dominance for graph regions is that all values are in-scope in all blocks. so: all blocks dominate all other blocks and all values within a block dominate all other values. This also highlights the fact that CSE needs some different machinery to properly handle Graph Regions, since it currently traverses using DominatorTrees.

mlir/lib/IR/Verifier.cpp
267

I've added some tests for this. The more problematic case is the reverse:

// region without dominance
bar.non_dominance_region() {
  // region with dominance
  bar.dominance_region {
    // Should this be illegal?
    bar.use %foo_value
    bar.return
  }
  %foo_value = constant true

It's not clear to me that there is a useful interpretation for this. At the moment I've made this illegal, but it's something that could conceivably be relaxed.

stephenneuendorffer updated this revision to Diff 274283.
stephenneuendorffer marked an inline comment as done.Jun 29 2020, 4:38 PM
stephenneuendorffer added inline comments.
mlir/lib/IR/Verifier.cpp
267

The code above is actually legal, since "%foo_value" is legal to use as an argument to bar.dominance_region

Harbormaster failed remote builds in B62245: Diff 274282!
rriddle added inline comments.Jun 29 2020, 4:52 PM
mlir/lib/IR/Verifier.cpp
267

The fairly simple rule that I've always applied to dominance is that the parent operations act as dominance synch points. The parent operation of the definition has to dominate the ancestor of the user that is within that region.

stephenneuendorffer marked 2 inline comments as done.Jun 29 2020, 5:39 PM
stephenneuendorffer added inline comments.
mlir/lib/IR/Verifier.cpp
267

Agreed. And I think this lifts immediately to Graph Regions: In a Graph Region, everything dominates everything else.

mehdi_amini added inline comments.Jun 29 2020, 9:06 PM
mlir/docs/LangRef.md
440

This defines block in a way that does not seem to me to be compatible with graph regions.

513

You define a notion of successor here, does it generalize to graph region?

575

in which case some other operation will execute

That does not read well to me right now.

624–636

It is interesting that this ends up nested under the SSACFG region case

654

typo: relatinoships

656

I think it should be either with values representing streams of data. or where values represent streams of data.

684

The story about block terminators, successors, etc. isn't clear to me with respect to graph region. Even the relationship between blocks isn't entirely clear here actually.

mlir/include/mlir/IR/Dominance.h
63

The concept of entry block is no longer a guarantee

stephenneuendorffer marked 11 inline comments as done.
stephenneuendorffer marked an inline comment as done.Jun 29 2020, 10:56 PM
stephenneuendorffer added inline comments.
mlir/docs/LangRef.md
513

Currently we keep around blocks in a graph region after transforming from an SSACFG region. The blocks aren't semantically meaningful (and can be eliminated in our current model without changing the semantics), but are at least useful bookkeeping.

575

I agree it's a little vague, but it's trying to bridge into the next section which talks about control flow and terminators. Do you have a suggestion?

624–636

It's all about control flow with multiple regions. Graph regions don't have a notion of control flow.

684

Correct, I haven't defined any semantic requirements about blocks in graph regions.

mlir/include/mlir/IR/Dominance.h
63

There are some aspects of the interface of this class which are awkward. If regions define their own concept of dominance, then there probably needs to be an abstract interface exposing different implementations of the 'local' concept of dominance. Currently, the two implementations are somewhat awkwardly tangled together along with the hierarchy traversing code. I see that there should be some additional refactoring of this code which might push the local concept of dominance into RegionKindInterface, but I'd like to get the concepts agreed to first.

Another example is that getRootNode() has to be guarded by hasDominanceInfo(). getRootNode() isn't really appropriate for regions without an entry block.

mehdi_amini added inline comments.Jul 3 2020, 9:18 PM
mlir/docs/LangRef.md
68

In the definition of Blocks, we just wrote A *Block* is an ordered list of operations, can we just do the same here and avoid "totally ordered"?
Maybe like the following:

Operations are also organized into [Blocks](#blocks) as ordered list  (the order may or may not have a semantics important, see [Region Kinds](link)).  Blocks are themselves organized as an ordered list within their containing [Region](#regions).
80

each transformation on operations to be designed with the semantics in mind

I don't understand what this means here.

219

Argument identifiers in mapping functions

Not sure what "mapping function" refers to here? It is in the existing doc, but this is the only place it is used. I wonder if it isn't something remaining from before functions were regular Operations?

Is this about block argument? Can you just write "Block argument" instead?

Edit: I wonder if this is about affine maps?
(we can leave this out of this revision, but we should probably revisit this)

221

Particular operations may further limit the valid scope of identifiers

I think this is not specific enough: the scope is only limited by the region kind and I'd rather spell it out this way.
With how you wrote it, one could think that we could have an operation restricting the scope of the value it produces (for example to be block-local).

351–357

It isn't clear to me why a module wouldn't be a graph region?
Do we put any meaning to the order of operations in a module?

434

Is "commonly" intended to express that this isn't always the case?
What about spelling out the exact case: "In a SSACFG region, blocks represent ...`

435

i.e. Branches

Shouldn't this be e.g. instead?

496

I feel the section here isn't entirely complete with respect to non CFG region with multiple blocks: are these still having terminator? What's the contract on block arguments?

An example would be worthwhile as well.

mehdi_amini added inline comments.Jul 3 2020, 9:18 PM
mlir/docs/LangRef.md
684

To me this is like my comment above about terminator, successor, etc.
Even if it isn't requires, let's make it explicit.

stephenneuendorffer marked 18 inline comments as done.Jul 6 2020, 9:58 AM
stephenneuendorffer added inline comments.
mlir/docs/LangRef.md
80

I've rewritten this section.

219

Yes, I believe this is talking about affine maps.

221

You're right, that could be misleading. I've rewritten it.

351–357

I suppose it could be, although it's somewhat of a degenerate kind of region anyway, since modules don't define any values and don't contain blocks. Maybe this should be another kind of region?

496

I added an additional note about the Graph Region rationale, but I'm really not sure how to provide an example here of something that I don't have a good example of :). Let's chat offline about this.

stephenneuendorffer marked 4 inline comments as done.
mehdi_amini added inline comments.Jul 6 2020, 10:01 PM
mlir/docs/LangRef.md
683

The indentation does not seem right in this example?
The reuse of %2 does not help readability either, just like %4 before %2

stephenneuendorffer marked 2 inline comments as done.
rriddle accepted this revision.Jul 13 2020, 7:02 PM
rriddle marked an inline comment as done.

Thanks for all of the work on this! Mostly nits, so I think this looks good enough to land for now after resolving. We can iterate on the other necessary aspects afterwards.

mlir/docs/Interfaces.md
206

nit: Drop the newline here.

232

ODS can already generate the documentation. I'll take a look at hooking it into here after this lands.

https://github.com/llvm/llvm-project/blob/bf74c3838904b2df2a0f31bc457af1bdb811953f/mlir/tools/mlir-tblgen/OpInterfacesGen.cpp#L414

mlir/docs/LangRef.md
64

nit: *Operations* to match the *Values* later.

65

nit: I would clarify Argument ->Block Argument here.

68

nit: Regions to match the tense of the others.

220

nit: You may want to clarify here that nested regions count as being within the parent region.

351–357

Should a module be an SSA CFG region? It could very well be a graph region.

(This is not meant to block anything, but something to consider for the future).

356

nit: Could you please you link symbol name to the symbol doc again here?

434

typo: represent represents

434

Seems like this only applies to non-entry blocks, as the entry block(s) may have values provided the parent operation.

513

nit: Can you remove the double space before The.

548

nit; Same here, seems to be a few double space sentences in this paragraph.

601

nit: Remove the indent of this code snippet.

613

nitL Double space here.

627

nit: Double space here.

640

typo: ot.

660

nit: Can you wrap this in a Context or Rationale section? I think there are other examples in this doc.

672

nit: I would likely mention that terminator operations cannot be re-ordered.

687

Remove this now?

mlir/include/mlir/IR/Dominance.h
71–73

nit: Can you remove the double space sentences in this file?

87

nit: B, i.e.

94–95

Same here and below: B, i.e.

mlir/include/mlir/IR/RegionKindInterface.h
19

I don't think this one is necessary.

23

nit: Please use /// here.

mlir/include/mlir/IR/RegionKindInterface.td
23

There is a good doc for this in Interfaces.md, can you paste that here?

25

nit: You should be able to set the namespace(via cppNamespace) now.

31

Unresolved?

41

nit: the strings are slightly misaligned here.

mlir/lib/IR/Dominance.cpp
46

nit: Cache the end iterator of this loop.

47

nit: Spell out auto here.

240

nit: Drop else after return.

262

nit: Please drop the uses of mlir:: in this file.

264

nit: Can you define a static helper for this in this file?

static bool hasSSADominance(Operation *op, RegionKindInterface interface, unsigned regionNum) {
    return ancestor->isRegistered() &&
          (!kindInterface || kindInterface.hasSSADominance(aRegionNum));
}
308

Same here, drop else after return.

mlir/lib/IR/Verifier.cpp
187–205

nit: Cache the end iterator here.

188

nit: Spell out auto here.

202

nit: expected graph region?

280

nit: Cache the end iterator here, and can you inline the region variable below into its use?

282

nit: Spell out auto here.

mlir/lib/Transforms/CSE.cpp
174

Can you reword the second part of this into a TODO:?

175

nit: Drop the double space here.

177

nit: Drop trivial braces.

mlir/test/IR/parser.mlir
1249–1262

Can you move the CHECK lines to non-IR lines to match the other tests? Ends up being much cleaner IMO.

mlir/test/IR/traits.mlir
406

nit: Please remove the CHECK lines that don't relate to what you are checking. This applies to the other tests as well. e.g. You don't really need to check return or the braces here.

415

nit: This comment seems wrong.

428

nit: Can you indent these expected lines? Same below.

442

nit: Please remove the explicit SSA value names, and please remove the unnecessary bits of this test.

505

nit: Indent the graph region.

mlir/test/lib/Dialect/Test/TestDialect.cpp
263

Unresolved here?

mlir/test/lib/Dialect/Test/TestOps.td
1196

nit: You could use a "" for the descriptions as well.

stephenneuendorffer marked 56 inline comments as done.Jul 14 2020, 3:10 PM

If there are no further comments, I'd like to commit this. @bondhugula @mehdi_amini Do you want to accept?

Note that there's a small amount of new text under "### Regions" where I tried to capture the understanding about how Region Arguments are shared with the Block Arguments of the entry block, but that operation arguments/return values are explicitly decoupled from region arguments and terminators. Also terminator arguments do not need to match the block arguments of their successor blocks. All of this is under the control of specific operations and may be arbitrarily defined.

mlir/docs/LangRef.md
647

I've reworded this.

683

Actually, there were a few other minor problems with this testcase, including the fact that it missed out on the fact that:

%1 = op(%1)

should be valid, along with:

%1 = op() { otherop(%1) }

I also turned this into another testcase.

stephenneuendorffer edited the summary of this revision. (Show Details)
stephenneuendorffer marked an inline comment as done.
bondhugula added inline comments.Jul 15 2020, 11:27 AM
mlir/include/mlir/IR/Dominance.h
86–92

Reflow: I think you aren't using the whole width.

mlir/lib/IR/Dominance.cpp
27

These should be ///

43

Nit: This should be //.

262–264

Nit: reflow to use width.

mlir/lib/IR/Verifier.cpp
66

Nit: terminate comment with period.

Quick question: how do we know in printed IR if something is a graph region or an SSA one?

mlir/docs/LangRef.md
65

a Block Argument?

77–107

Nit: Reflow to use width.

stephenneuendorffer marked 8 inline comments as done.Jul 15 2020, 12:40 PM

Quick question: how do we know in printed IR if something is a graph region or an SSA one?

You don't. Syntactically, there is no distinction for unregistered ops. Unregistered ops are handled conservatively as graph regions, so they are allowed to violate SSA dominance. Passes that require dominance-order traversal (i.e. CSE) won't operate on regions inside unregistered ops.

mlir/docs/LangRef.md
65

I parse it as 'exactly one (Operation or Block Argument)', so I think it makes sense as is?

stephenneuendorffer marked an inline comment as done.
This revision was not accepted when it landed; it landed in state Needs Review.Jul 15 2020, 2:32 PM
This revision was automatically updated to reflect the committed changes.