Currently the builtin dialect is the default namespace used for parsing
and printing. As such module and func don't need to be prefixed.
In the case of some dialects that defines new regions for their own
purpose (like SpirV modules for example), it can be beneficial to
change the default dialect in order to improve readability.
Details
Diff Detail
- Repository
- rG LLVM Github Monorepo
Event Timeline
Don't use the default dialect to strip prefix for generic operations
(also update tests)
Won't this also drop the occurrences of the string that looks like a prefix from other places than operation name? Everything goes through the same stream AFAICS, so something like my.op() attributes { my.attribute = "this.may.use.the.word.builtin.because", my.builtin.enabled = 1} will get affected where it shouldn't.
mlir/include/mlir/IR/OpImplementation.h | ||
---|---|---|
969–971 ↗ | (On Diff #363386) | |
mlir/lib/IR/AsmPrinter.cpp | ||
2322–2326 | ||
2343–2346 | I can't match this comment with the code below. |
I don't think so because we only filter the beginning of the stream and nothing else. The filter is immediately clear whether it matches or not the beginning of the stream. The filter is also set immediately before printing an operation. But I'll extend the test to show this!
Address comments
mlir/lib/IR/AsmPrinter.cpp | ||
---|---|---|
2343–2346 | Ah, this is some explanation on the correctness of the code, I'll try to reformulate. |
mlir/include/mlir/IR/OpImplementation.h | ||
---|---|---|
969–971 ↗ | (On Diff #363386) | Actually this edit was incorrect: if would skip the check on the current op |
Nice!
mlir/include/mlir/IR/OpAsmInterface.td | ||
---|---|---|
68 | Can we instead just move this in the other interface? I don't think we need a special interface for this, especially given that OpAsmOpInterface is described as being a general parser/printer hook. | |
mlir/include/mlir/IR/OpImplementation.h | ||
968 ↗ | (On Diff #363584) | Can we move this to AsmPrinter.cpp? I don't see any uses outside of there, or is there some reason to have this publicly exposable? |
971 ↗ | (On Diff #363584) | Do we want it to be inherited? I would almost prefer that the default only apply to the immediate regions. It seems less surprising that way, and more contextualized. For example assuming we inherit default names, if I split out a nested region op from a reduced test case, I might need to re-add operation namespaces that were filtered out from the parent (as the parent may now be different). |
mlir/lib/IR/AsmPrinter.cpp | ||
2310 | We discussed a little offline, but I think it would be better to just remove the constraint that custom op printers print the operation name at the beginning. I don't see a real reason for them to do that, and it would also match the behavior of the parser (which doesn't parse the operation name, for likely obvious reasons). | |
2433 | Do we need this to be std::string? I would assume that we could enforce StringRef here (getDefaultDialect also returns StringRef). | |
mlir/lib/Parser/ParserState.h | ||
91 | Same comment here, can we use StringRef instead of std::string? |
mlir/include/mlir/IR/OpAsmInterface.td | ||
---|---|---|
83 | Do we want to allow multiple default namespaces? Some projects might have several that all kind of co-mingle with each other. | |
mlir/include/mlir/IR/OpImplementation.h | ||
973 ↗ | (On Diff #363584) | Do we need to default to builtin? Can we instead default to nothing? |
mlir/lib/IR/AsmPrinter.cpp | ||
2433 | I think I missed it, but how is this parser stack different from the one in the ParserState? |
mlir/include/mlir/IR/OpAsmInterface.td | ||
---|---|---|
83 | I can't imagine what it'd look like right now: we'd have to guarantee these dialects don't have any pair of op with the same name to avoid conflicts. | |
mlir/include/mlir/IR/OpImplementation.h | ||
971 ↗ | (On Diff #363584) | Yeah I hesitated on this, we can start with a shallow model conservatively |
973 ↗ | (On Diff #363584) | That mean that we wouldn't be able to parse a top-level module right now though, do you want to change this now? (I'd be fine with this but I would land this in two revision though) |
mlir/lib/IR/AsmPrinter.cpp | ||
2310 | Yeah I looked into it, one wrinkle though is that something like Dialect::printOperation can fail and we fallback to the generic operation printer. | |
2433 | This is a printer stack, and the other is the parser one? They are equivalent otherwise. |
mlir/include/mlir/IR/OpImplementation.h | ||
---|---|---|
973 ↗ | (On Diff #363584) | I think it's reasonable/fine to require the "builtin" namespace on the top-level module. I'm also okay with it being separate from this commit. |
mlir/lib/IR/AsmPrinter.cpp | ||
2310 | Hmmm. Can we change Dialect::printOperation to act more like getParseOperationHook? That way we would know if it would succeed before calling it. | |
2433 | Right right, missed that one was printer/one was parser. | |
mlir/lib/Parser/Parser.cpp | ||
1846 | Would it be better to keep StringRef here, but have another variable that is OperationName? That would remove the need to build a std::string, and the OperationName var could be passed in later when building the OperationState. | |
1869–1875 | Shouldn't this one be .str()? |
mlir/include/mlir/IR/OpAsmInterface.td | ||
---|---|---|
56 | under a given operation -> under this operation (cf. doc of getAsmResultNames above to align.) | |
56 | dialect -> dialect prefix | |
57 | Nit: Punctuation broken here. | |
58–59 | if it was implementing this interface to return ... -> if this method returned spv. ^^ "it" is dangling here. | |
mlir/lib/IR/AsmPrinter.cpp | ||
2316–2317 | Nit: reflow to use width. | |
2361 | Camel case? | |
2429 | get -> gets / is |
purpose (like SpirV modules for example), it can be beneficial to
change the default dialect in order to improve readability.
Does this really improve readability? Doesn't it create ambiguity for a reader as to which dialect's operation is being referring to (when you have the same names)? I might have misunderstood it but that's what the commit summary conveys to me.
Does this really improve readability? Doesn't it create ambiguity for a reader as to which dialect's operation is being referring to (when you have the same names)? I might have misunderstood it but that's what the commit summary conveys to me.
Yes you’re right. I think this is about how it gets used though.
I’d claim that it is less intrusive than the custom assembly that we allow (we actually had internal teams complaining about it: affine, linalg, scf, … all have different custom conventions).
If you remember the original affine functions, I don’t think operations were prefixed: it was later during generalization that we added the prefix. The prefix wasn’t for readability but for necessary disambiguating in the parser.
The idea here is that you can see it like an extension of custom assembly: instead of just controlling the immediate operation you can impact the immediately nested regions in a limited way.
My main motivation is the TensorFlow graph dialect right now: when I write tfg.graph { I’d like to avoid prefixing every single operation with tfg. inside the graph itself.
This isn’t a region that allows anything else anyway, but even if it did, the few outlier operation would be prefixed.
mlir/include/mlir/IR/OpAsmInterface.td | ||
---|---|---|
57 | I can't spot what you refer to here? | |
mlir/lib/IR/AsmPrinter.cpp | ||
2310 | (I'll look into this later) | |
2361 | I am overriding the parent class from LLVM, can't do? | |
mlir/lib/Parser/Parser.cpp | ||
1846 | I tried but it's convoluted: OperationName does not have a default constructor. |
It did also disambiguate for the user though I'd say. Less of an issue with smaller number of dialects and all kept in head (we've also never had two dialects with i32 type for example, but if one saw i32 type one would not consider if from dialect or not)
The idea here is that you can see it like an extension of custom assembly: instead of just controlling the immediate operation you can impact the immediately nested regions in a limited way.
I think that is the question. This is like having a using namespace statement implicitly and attached to a region (well I think all regions of dialect specified ops?). So when looking at a dump a reader has to know which disambiguating rules to follow where implicitly (there is no visual cue). I'm assuming when an error is emitted we don't strip prefix (so that error is unambiguous without additional context).
And then Rivers question also come in to play: what if I have 3 dialects that I use equally in a given op's "first layer of ops" (as even TFG I believe allows nesting and ops inside the nested region may not be TFG), now do I have to decide based on preferred one.
I do agree that folks have complained about custom syntax being inconsistent, but that doesn't seem like a good argument for why this is good.
I do agree that folks have complained about custom syntax being inconsistent, but that doesn't seem like a good argument for why this is good.
Absolutely: the argument I made is that it is consistent with how we approached it in MLIR so far ; when the generic printer isn’t used you get an output that is readable optimally for folks familiar with a given dialect. And also that we provide this power to dialect authors, and trust them to use it responsibly.
(as even TFG I believe allows nesting and ops inside the nested region may not be TFG),
Nesting further under TFG shouldn’t be an issue, it would be a different region which may or may not use the same default (or none).
now do I have to decide based on preferred one.
Yes, or none: if three dialects are equally used in a region, why strip any of them? The fact that we can strip does not mean we should!
Other than Spirv that already restricts that only spirv ops are present, I don’t see any other place in-tree where we would use this feature right now.
Then, the ambiguity that this adds due to the prefix being dropped based on the context appears really undesirable to me at first sight. It's more readable with the prefix for me. Of course, you are providing an option for a dialect to elide things, but this whole context sensitive printing (esp. for large regions) in the face of collisions can be quite painful. This also adds a layer of non-uniformity and surprises for people working across dialects and when folks encounter a new dialect. MLIR being an IR and not a user-facing programming language, I'm not sure you gain much by dropping a prefix like tfg to your ops.
Could you discuss this change on discourse? (Sorry if you already did and I missed it.) I feel we should apply brakes in this case on context sensitive elision and I'm a -1 here.
I think you'll have to elaborate with examples, because I don't see it really.
This also adds a layer of non-uniformity and surprises for people working across dialects and when folks encounter a new dialect. MLIR being an IR and not a user-facing programming language, I'm not sure you gain much by dropping a prefix like tfg to your ops.
The exact same argument applies to the custom assembly used by most dialect I believe.
Could you discuss this change on discourse? (Sorry if you already did and I missed it.)
Will do!
mlir/test/lib/Dialect/Test/TestOps.td | ||
---|---|---|
637–638 | Comment is copy-pasta :-) |
mlir/lib/IR/AsmPrinter.cpp | ||
---|---|---|
2310 |
I agree completely, the stream approach here is too-clever. It is also a pain (and pointless) to print the op name anyway, it would be far better to have the asmprinter do this by default. |
Rebase on top of D108804 to remove the "filtering stream" now that the framework has direct control on printing the op name
mlir/include/mlir/IR/OpAsmInterface.td | ||
---|---|---|
56 | ||
58 | ||
59–61 | I don't think last statement is up-to-date. An empty string means no elision right? not that it is inherited. | |
mlir/lib/IR/AsmPrinter.cpp | ||
2428 | ||
2671–2678 | You could also use llvm::make_scope_exit instead. | |
2682 | Shouldn't we push an empty string here instead? | |
mlir/lib/IR/Operation.cpp | ||
651 | Can we add a TODO/FIXME to kill this line? | |
mlir/lib/Parser/Parser.cpp | ||
1892 | Same comment here: why do we use builtin as the fallback here? I would expect an empty default dialect to mean "no ellision". | |
1903–1910 | You can use llvm::make_scope_exit here instead. |
mlir/lib/IR/AsmPrinter.cpp | ||
---|---|---|
2671–2678 | Grahh: believe me I searched for this utility and couldn't find it and concluded that my memory of its existence was bogus... |
mlir/include/mlir/IR/OpImplementation.h | ||
---|---|---|
973 ↗ | (On Diff #363584) | FYI, for future reference, this change was disruptive for us -- we had already updated tests to use builtin.func (in CHECK lines, etc.) or written new tests that were natively checking for builtin.func. After this change, we had to change it back. |
mlir/include/mlir/IR/OpImplementation.h | ||
---|---|---|
973 ↗ | (On Diff #363584) | (not sure if it was exactly this change, but something caused builtin.func to print as just func since our last integrate.) |
mlir/include/mlir/IR/OpImplementation.h | ||
---|---|---|
973 ↗ | (On Diff #363584) | This patch is the culprit indeed. Before: you could parse without the builtin prefix but it would get printed, while now it'll be omitted if it can be parsed basically. |
mlir/include/mlir/IR/OpImplementation.h | ||
---|---|---|
973 ↗ | (On Diff #363584) | I think most tests are written by generate-test-checks.py which isn't smart enough to know that ;) |