This revision adds a tool that generates the ODS and C++ implementation for "named" Linalg ops according to the [RFC discussion](https://llvm.discourse.group/t/rfc-declarative-named-ops-in-the-linalg-dialect/745). While the mechanisms and language aspects are by no means set in stone, this revision allows connecting the pieces end-to-end from a mathematical-like specification. Some implementation details and short-term decisions taken for the purpose of bootstrapping and that are not set in stone include: 1. using a "[Tensor Comprehension](https://arxiv.org/abs/1802.04730)-inspired" syntax 2. implicit and eager discovery of dims and symbols when parsing 3. using EDSC ops to specify the computation (e.g. std_addf, std_mul_f, ...) A followup revision will connect this tool to tablegen mechanisms and allow the emission of named Linalg ops that automatically lower to various loop forms and run end to end. For the following "Tensor Comprehension-inspired" string: ``` def batch_matmul(A: f32(Batch, M, K), B: f32(K, N)) -> (C: f32(Batch, M, N)) { C(b, m, n) = std_addf<k>(std_mulf(A(b, m, k), B(k, n))); } ``` With -gen-ods-decl=1, this emits (modulo formatting): ``` def batch_matmulOp : LinalgNamedStructured_Op<"batch_matmul", [ NInputs<2>, NOutputs<1>, NamedStructuredOpTraits]> { let arguments = (ins Variadic<LinalgOperand>:$views); let results = (outs Variadic<AnyRankedTensor>:$output_tensors); let extraClassDeclaration = [{ llvm::Optional<SmallVector<StringRef, 8>> referenceIterators(); llvm::Optional<SmallVector<AffineMap, 8>> referenceIndexingMaps(); void regionBuilder(ArrayRef<BlockArgument> args); }]; let hasFolder = 1; } ``` With -gen-ods-impl, this emits (modulo formatting): ``` llvm::Optional<SmallVector<StringRef, 8>> batch_matmul::referenceIterators() { return SmallVector<StringRef, 8>{ getParallelIteratorTypeName(), getParallelIteratorTypeName(), getParallelIteratorTypeName(), getReductionIteratorTypeName() }; } llvm::Optional<SmallVector<AffineMap, 8>> batch_matmul::referenceIndexingMaps() { MLIRContext *context = getContext(); AffineExpr d0, d1, d2, d3; bindDims(context, d0, d1, d2, d3); return SmallVector<AffineMap, 8>{ AffineMap::get(4, 0, {d0, d1, d3}), AffineMap::get(4, 0, {d3, d2}), AffineMap::get(4, 0, {d0, d1, d2}) }; } void batch_matmul::regionBuilder(ArrayRef<BlockArgument> args) { using namespace edsc; using namespace intrinsics; ValueHandle _0(args[0]), _1(args[1]), _2(args[2]); ValueHandle _4 = std_mulf(_0, _1); ValueHandle _5 = std_addf(_2, _4); (linalg_yield(ValueRange{ _5 })); } ```
Details
Diff Detail
- Repository
- rG LLVM Github Monorepo
Unit Tests
Time | Test | |
---|---|---|
160 ms | MLIR.mlir-linalg-ods-gen::Unknown Unit Message ("") |
Event Timeline
Do you intend for this to be "approaching production quality code" and reviewed as such or still proof-of-concept level?
mlir/test/mlir-linalg-ods-gen/test-linalg-ods-gen.tc | ||
---|---|---|
24 | This actually looks like it could be reasonably parsed with a custom linalg_named_op_gen.def op? Or is there a dependency issue with using MLIR for this as this needs to happen during build time? |
@silvas I'd hope closer to "approaching production quality", it is missing comments though which will make it easier to read.
Note that almost everything above l. 1000 in mlir-linalg-ods-gen.cpp is borrowed from other places.
For some reason Token, Lexer and core Parser are kept hidden within MLIR and I would very much like to expose them and avoid the copy-pasta (@rriddle what's your take on this?).
mlir/test/mlir-linalg-ods-gen/test-linalg-ods-gen.tc | ||
---|---|---|
24 | This needs to generate ODS which in turn defines new ops. |
@silvas refactored so that things are better layereed.
Code above the following code block at line ~800 ish is taken from other places in MLIR and should be refactored out once lexer/parser is exposed.
//===----------------------------------------------------------------------===// // TC parsing. //===----------------------------------------------------------------------===//
mlir/tools/mlir-linalg-ods-gen/mlir-linalg-ods-gen.cpp | ||
---|---|---|
2 | Missing license header? |
First round of comments.
mlir/include/mlir/Dialect/Linalg/IR/LinalgStructuredOps.td | ||
---|---|---|
828 | Is there another diff that includes this? | |
mlir/test/mlir-linalg-ods-gen/test-linalg-ods-gen.tc | ||
2 | add test that exercises multiple comprehensions in the body | |
6 | layering-wise would prefer to not test this here. If needed, we can add a separate test elsewhere that does this .td -> .inc file check. Strictly speaking what ends up in the .inc file is not really the concern of this component, only the contents of the .td file. | |
mlir/tools/mlir-linalg-ods-gen/mlir-linalg-ods-gen.cpp | ||
975 | some spaces needed around first tensor-def-list | |
978 | I don't see affine-expr or tensor-typedef mentioned locally in this comment. move this to the appropriate comment? | |
1371 | should this be a diagnostic? | |
1380 | How can we emit ODS before we finish processing the whole tc-def production? | |
1411 | maybe rename to "parseAndEmitTCDef"? Also probably rename processOneComprehension to parseAndEmitOneComprehension to be consistent with that. | |
1429 | typo in the "expected" string. | |
1439–1440 | Can you make this comment a bit easier to understand. What is an "eagerly discovered symbol" and how does this "normalize" it? | |
1445–1446 | Instead of the ternary, use static AffineMap get(unsigned dimCount, unsigned symbolCount, ArrayRef<AffineExpr> results, MLIRContext *context); | |
1455 | comma separated comprehensions seems to contradict the grammar? | |
1656 | auto here obscures things IMO | |
1662 | auto here obscures things IMO |
mlir/include/mlir/Dialect/Linalg/IR/LinalgStructuredOps.td | ||
---|---|---|
828 | This is eagerly included to allow the test to pipe through mlir-tblgen and verify the ODS is well-formed (as was suggested by @ftynse, see other comment). The companion revision https://reviews.llvm.org/D76456 does the plumbing assuming a parser exists and shows how to make this run end-to-end. In the current form this is non-functional and only exists for the purpose of verifying well-formedness and avoiding a giant diff when things can be (reasonably well) separated. If you have strong objections against this interim state, I would rather drop the piping through tablegen rather than merge revisions (but @ftynse may have his own objections to this). | |
mlir/test/mlir-linalg-ods-gen/test-linalg-ods-gen.tc | ||
2 | Despite the parser accepting it, there is no support for that atm and some eperiment + design is required here. | |
6 | This was suggested by @ftynse to show the ODS is valid and how it connects to tblgen by mirroring this test: https://github.com/llvm/llvm-project/blob/master/mlir/test/mlir-tblgen/llvm-intrinsics.td#L11. I am fine either way, would just like consensus on this before reverting back to the previous state. | |
mlir/tools/mlir-linalg-ods-gen/mlir-linalg-ods-gen.cpp | ||
1380 | Thanks! | |
1455 | you're right, thanks! |
It would be great to share some parts with the main parser, for example affine expression parsing. I think we can pretty much have parseAffineExpr(StringRef) declared in a private header and use it here, possibly with some semantic post-checks on the expression not involving, e.g., SSA values.
mlir/include/mlir/IR/AffineExpr.h | ||
---|---|---|
222 | Nit: AffineExpr is a value-type, can't we just pass it by-value ? | |
mlir/tools/mlir-linalg-ods-gen/mlir-linalg-ods-gen.cpp | ||
87 | This makes it look like it's a token start... How about FIRST_KEYWORD = kw_def, LAST_KEYWORD=kw_select? | |
136 | Copy-pasta comment, this is not "operation assembly" | |
289 | Missed select keyword. We could have some macro magic to make sure modifying the list of tokens also handles them in the lexer. | |
295 | The code in getUInt64IntegerValue seems to support hex integers, but this clearly does not. | |
372 | Nit: llvm::function_ref if you don't store the argument | |
546 | Nit: the operand in consumeToken here and below is redundant, the case-expression just above ensures that the token is of the right kind. | |
841 | This should be trivial to implement without resorting to virtual functions, dispatching on kind and using static_cast. | |
850 | Nit: tensor-id is not defined | |
856 | Nit: = default would also work | |
860 | If you use LLVM-style type system, you would normally want to avoid virtual functions... | |
874 | Nit: given that PreOrder is a boolean template parameter, I am not sure what "perfoms PreOrder traversal` means when the parameter is false. Post-order? In-order? Compilation error? | |
894 | Would MutableArrayRef work instead of hardcoding SmallVector with a given size? | |
911 | Do you care about the order of reduction dimensions? | |
929 | Why SetVector? In TC, we wouldn't care about the order of reduction dimensions. | |
941–942 | And what if discovery mode != symbols ? | |
951 | Nit: something went wrong with formatting here: | ran away to the right. I personally prefer something like foo ::= token token continuation line of the same rule | another rule | |
975 | Nit: this comment repeats the comment on struct TensorExpr. I am worried about it getting out of sync if the syntax evolves. My recommendation would be to only keep the syntax in a single comment (preferably, the implementation of this method), and just refer to that from the other comments. | |
982 | Nit: why pass by-pointer rather than by-reference? |
mlir/tools/mlir-linalg-ods-gen/mlir-linalg-ods-gen.cpp | ||
---|---|---|
397 | "proper token" is unclear as error message | |
928 | Do you actually need pointers? Can't we just store Expressions as is, eventually with appropriate move semantics to avoid extra copies? | |
990 | Why does a parsing function accept an _output_ stream? | |
995 | Plz document what does it print | |
1051 | Do you actually need this? I only see DenseMap<TensorExpr *>, which should be using a generic pointer-based map implementation. | |
1054 | How about DenseMapInfo<StringRef>::getTombstoneKey() instead? | |
1067 | Same as above, AffineMap also has a DenseMapInfo if I'm not mistaken. | |
1084 | Nit: document how the visitation behaves if the callback mutates the visited object | |
1127 | Nit: would emplace_back work? | |
1136 | Have you considered storing tensors in an llvm::StringMap indexed by name instead of doing linear lookups every time? | |
1148 | Naming nit: isa is widely used for downcasting, this is just a lookup; prefer is. | |
1173 | Would emplace_back work? | |
1193 | Could you just have a default message expected %tokenname% instead of having a similar string everywhere | |
1197 | It looks like it would parse just about any id. "expected a type id" sounds a bit misleading because "type id" is not a production rule, and there's no additional check on the id somehow being a type. | |
1202 | Nit: add a description in the assertion. Also, are we sure this can never happen? | |
1281 | Ultra-nit: we tend to use single quotes rather than backticks in error messages | |
1283 | Nit: /*allowEmptyList=*/true | |
1315 | /*allowEmptyList=*/true | |
1323 | This may crash if you have less LHS declarations than RHS definitions. | |
1328 | /*allowEmptyList=*/true | |
1333 | dimCount and symbolCount make the comment look outdated, is it? | |
1353 | Did you check that indexings were different? | |
1366 | Nit: I'd use early return here | |
1374 | [Not for this commit]: I would rather have the parser accept the correct syntax, and have a separate check that implements "semantic" rules. | |
1415 | /*allowEmptyList=*/true | |
1426 | /*allowEmptyList=*/true | |
1431 | typo: "symbolicc" | |
1491 | Why is the result optional? | |
1561 | Nit: could we use more meaningful names than ss2? | |
1627 | C++14 supports auto for lambda arguments | |
1646 | Alternatively, you could use ss.str() instead of valueHandleStr below. |
mlir/test/mlir-linalg-ods-gen/test-linalg-ods-gen.tc | ||
---|---|---|
6 | I'm not sure I understand what is the concern here? The ODS check verifies the content of the produced .td file, _not_ the result of feeding that .td file to mlir-tblgen -gen-op-defs, which is indeed a separate concern. The IMPL check verifies the implementations of methods that are declared in the .td file and there is simply no other place where we can verify them. The staging here is:
@nicolasvasilache the test you referenced also has RUN lines making sure mlir-tblgen can consume what the first stage produces. Consider adding them here as well. This could help detect cases of ODS syntax change (the simple syntactic test passes, but not the piping check). That's why there is only a trivial check to make sure FileCheck eats something. |
mlir/test/mlir-linalg-ods-gen/test-linalg-ods-gen.tc | ||
---|---|---|
6 | I'm not sure I understand what is the concern here? ... Consider adding them here as well. That's precisely what the concern was IIUC, piping through mlir-tblgen (see previous snapshot that I updated improperly https://reviews.llvm.org/D77067?id=254251). Restored that part of the test. | |
mlir/tools/mlir-linalg-ods-gen/mlir-linalg-ods-gen.cpp | ||
289 | Rather than continue duplicating here, MLIR should expose the lexer and parser and everyone's life will be better. | |
295 | Yes, I am trimming liberally until MLIR exposes its lexer and parser at which point all this can disappear. | |
860 | why ? https://llvm.org/docs/HowToSetUpLLVMStyleRTTI.html#basic-setup shows it's perfectly fine to use abstract base classes and LLVM RTTI. | |
911 | you should otherwise your computation is non-deterministic | |
928 | It's this or uniqu'ing, underlying storage, placement new etc etc. | |
929 | reductions loops don't commute in FP land | |
1136 | I need 2 extra maps and really don't anticipate a single named op to ever to a point where this would matter. | |
1193 | I'm reluctant to invest more in duplicating something that should be exposed by core in a later NFC revision. | |
1374 | Agreed, there are a few other things for follwups too, thanks! | |
1431 | it's a faster symbolcc | |
1491 | this is what the ODS currently is because of manual "named ops", will be cleaned later. |
mlir/test/mlir-linalg-ods-gen/test-linalg-ods-gen.tc | ||
---|---|---|
6 | Ah, okay. Sorry for the confusion! When I saw C++ code I was assuming it was emitted by mlir-tblgen gen-op-def. But I see now that there is a mlir-linalg-tblgen -gen-impl that emits C++ as well. Sorry for the noise!!! |
mlir/test/mlir-linalg-ods-gen/test-linalg-ods-gen.tc | ||
---|---|---|
6 | Actually, when rereading I see that we do indeed invoke mlir-tblgen -gen-op-decls. I specifically object to the TBLGEN check prefixes here. I consider it a bug to do that (although i see the precendent in llvm-intrinsics.td, but I would have raised the same objection there), since it violates the layering: somebody updating mlir-tblgen shouldn't be able to break this test. Consider the implications of what is being checked now in this test... // TBLGEN-LABEL: linalg::batchmatmulOp declarations ^ could be broken by a change in a *comment* in the generated file :x // TBLGEN: class batchmatmulOpOperandAdaptor { ^ could be broken by adding a common base class to the operand adaptor classes, or a change in naming convention for the adaptor classes // TBLGEN: class batchmatmulOp : public Op< ^ could be changed by a change in base classes or naming convention. Note that none of those changes I've indicated would actually break any actual use of this code. So this test is just artificially constraining the implementation of mlir-tblgen for no real value. And even if you strip it down, all you would really be testing is def batchmatmulOp results in a class batchmatmulOp in the output, which is already tested in many places, such as, say, https://github.com/llvm/llvm-project/blob/master/mlir/test/mlir-tblgen/op-decl.td We need to be courteous to the maintainers of other components and give them the flexibility to adjust the implementations of their components. |
@nicolasvasilache any progress on reusing the MLIR parser? I consider that refactoring as blocking for submitting this patch. I don't want us to have a custom parser copypasted here that somebody has to clean up later without a strong reason.
mlir/test/mlir-linalg-ods-gen/test-linalg-ods-gen.tc | ||
---|---|---|
6 | I agree that this specific test is over-constraining for mlir-tblgen implementation. What I intended to test in intrinsicgen, and what I would like to see replicated here, is that the tablegen input produced by intrinsicgen, or my mlir-linalg-ods-gen, can be consumed by mlir-tblgen at all. Basically, we don't need to check for any output if we can find a way to check that mlir-tblgen exited with code 0 on the produced file. FileChecking the class name is just a workaround. If we don't do this check, we risk ending up in a situation where all of the existing tests pass (mlir-tblgen still generates expected C++ from ODS, and mlir-linalg-ods-gen still generates the strings expected by its test, just those strings are no longer valid ODS), but the pipeline fails. And given mlir-tblgen's tendency to assert or crash on improperly structured yet valid TableGen, it would be annoying to debug. |
meta-point: @ftynse let's not review the core parser code at the top of the file, as Nicolas says that they are just copypasta from the .mlir parser and won't be in the final patch.
@silvas I wouldn't review it if it was actual copy-pasta. It is an incomplete and modified copy, which is therefore likely to have some weird behavior or be able to get into an irreversible state where the original code wouldn't get.
mlir/tools/mlir-linalg-ods-gen/mlir-linalg-ods-gen.cpp | ||
---|---|---|
860 | Because you pay the runtime overhead price for two abstractions serving essentially the same goal. Why? | |
911 | I suppose you mean the IR you produce does not have a deterministic order of dimensions, which makes it hard to check. TC semantics says that all dimensions are interchangable, so their textual order should not matter. If it does, we should discuss the semantics and avoid branding this input as TC-like. | |
928 | I don't think vectors of unique_ptr are simpler than vectors of values. This is an extra abstraction with associated cognitive overhead. This is also one extra dynamic allocation per element, as opposed to occasional allocations in the vector, and no strong reason to maintain the pointer as unique or to auto-deallocate (other than you forced the allocation in the first place). There is no actual uniquing of expression, neither is there underlying storage or placement new, you seem to be mistaking this with how types/attributes are handled in MLIR. | |
1136 | Well, you currently have two extra vectors. I just don't see why prefer using a vector of pairs and implementing a search for _every one of them_ is better than using a dedicated container with accessor immediately available. |
@silvasean I consider that refactoring as blocking for submitting this patch. I don't want us to have a custom parser copypasted here that somebody has to clean up later without a strong reason.
I have been following precedent here, see https://reviews.llvm.org/D73405 which also introduces its own tokenizer / lexer / parser.
As far as I understand it, MLIR has been pretty opinionated about not wanting to expose its tokenizer / lexer / parser: I tried to have them exposed in the past but objections have been along the line of "it's very easy code to write anyway".
I would strongly prefer we revisit that but IMO it would be unfortunate that work is blocked on this refactoring.
Does this help mitigate your position?
Does this help mitigate your position?
Yes. I take back my request to break it out. I buy Chris' statement "I don’t think that splitting this out and pretending it is reusable is a good idea - too much of it is specific to decisions in the MLIR syntax".
mlir/test/mlir-linalg-ods-gen/test-linalg-ods-gen.tc | ||
---|---|---|
6 | Ah, ok. Then you can just remove the | FileCheck. The RUN line checks that the program has exit code 0, which won't be the case if mlir-tblgen runs into a syntax or processing error. |
mlir/test/mlir-linalg-ods-gen/test-linalg-ods-gen.tc | ||
---|---|---|
6 | Perfect, let's do this! |
mlir/test/mlir-linalg-ods-gen/test-linalg-ods-gen.tc | ||
---|---|---|
6 | Updated the test to get the minimal checkable thing: the class name. | |
mlir/tools/mlir-linalg-ods-gen/mlir-linalg-ods-gen.cpp | ||
860 | Marking as done, this is part of the more global reply on downcasting, uptr etc. | |
911 | Added a documentation section in Linalg.md. | |
928 | unique_ptr + abstract base class: basically because I use downcasting. vector<Expression> will slice unless derived classes have sizeof == 0 (i.e. there is an underlying pointer payload). An option is to implement a similar arena + pImpl to what MLIR does for the "by-value" abstractions. vector<unique_ptr<...>> is a standard and simple way to solve the slicing and ownership issue, its performance drawback are not relevant at this time IMO. | |
1136 | fair enough, done, thanks! |
Please fix the Windows build problem before landing. It looks like the pre-merge testing has such build now so you can use it for the initial check.
mlir/docs/Dialects/Linalg.md | ||
---|---|---|
473 ↗ | (On Diff #256354) | Nit: angle bracket notation |
mlir/test/mlir-linalg-ods-gen/test-linalg-ods-gen.tc | ||
6 | This won't work on Windows. Consider adding a -test-emit-additional-includes flag to mlir-linalg-ods-gen and use it here instead of trying shell magic. | |
mlir/tools/mlir-linalg-ods-gen/mlir-linalg-ods-gen.cpp | ||
823 | llvm_unreachable ? |
Is there another diff that includes this?