diff --git a/mlir/docs/Canonicalization.md b/mlir/docs/Canonicalization.md
--- a/mlir/docs/Canonicalization.md
+++ b/mlir/docs/Canonicalization.md
@@ -200,11 +200,11 @@
 value, generally returned by `fold`, and produces a "constant-like" operation
 that materializes that value.
 
-In [ODS](OpDefinitions.md), a dialect can set the `hasConstantMaterializer` bit
+In [ODS](DefiningDialects.md), a dialect can set the `hasConstantMaterializer` bit
 to generate a declaration for the `materializeConstant` method.
 
 ```tablegen
-def MyDialect_Dialect : ... {
+def MyDialect : ... {
   let hasConstantMaterializer = 1;
 }
 ```
diff --git a/mlir/docs/DefiningDialects.md b/mlir/docs/DefiningDialects.md
new file mode 100644
--- /dev/null
+++ b/mlir/docs/DefiningDialects.md
@@ -0,0 +1,310 @@
+# Defining Dialects
+
+This document describes how to define [Dialects](LangRef.md/#dialects).
+
+[TOC]
+
+## LangRef Refresher
+
+Before diving into how to define these constructs, below is a quick refresher
+from the [MLIR LangRef](LangRef.md).
+
+Dialects are the mechanism by which to engage with and extend the MLIR
+ecosystem. They allow for defining new [attributes](LangRef.md#attributes),
+[operations](LangRef.md#operations), and [types](LangRef.md#type-system).
+Dialects are used to model a variety of different abstractions; from traditional
+[arithmetic](Dialects/ArithmeticOps.md) to
+[pattern rewrites](Dialects/PDLOps.md); and are one of the most fundamental
+aspects of MLIR.
+
+## Defining a Dialect
+
+At the most fundamental level, defining a dialect in MLIR is as simple as
+specializing the
+[C++ `Dialect` class](https://github.com/llvm/llvm-project/blob/main/mlir/include/mlir/IR/Dialect.h).
+That being said, MLIR provides a powerful declarative specification mechanism via
+[TableGen](https://llvm.org/docs/TableGen/index.html); a generic language with
+tooling to maintain records of domain-specific information; that simplifies the
+definition process by automatically generating all of the necessary boilerplate
+C++ code, and also provides additional powerful tools on top (such as
+documentation generation). Given the above, the declarative specification is the
+expected mechanism for defining new dialects, and is the method detailed within
+this document. Before continuing, it is highly recommended that users review the
+[TableGen Programmer's Reference](https://llvm.org/docs/TableGen/ProgRef.html)
+for an introduction to its syntax and constructs.
+
+Below is an example of a simple Dialect definition. We generally recommend defining
+the Dialect class in a different `.td` file from the attributes, operations, types,
+and other sub-components of the dialect to establish a proper layering between
+the various different dialect components. This recommendation extends to all of
+the MLIR constructs, including [Interfaces](Interfaces.md) for example.
+
+```tablegen
+// Include the definition of the necessary tablegen constructs for defining
+// our dialect.
+include "mlir/IR/DialectBase.td"
+
+// Here is a simple definition of a dialect.
+def MyDialect : Dialect {
+  let summary = "A short one line description of my dialect.";
+  let description = [{
+    My dialect is a very important dialect. This section contains a much more
+    detailed description that documents all of the important pieces of information
+    to know about the dialect.
+  }];
+
+  /// This is the namespace of the dialect. It is used to encapsulate the sub-components
+  /// of the dialect, such as operations ("my_dialect.foo").
+  let name = "my_dialect";
+
+  /// The C++ namespace that the dialect, and its sub-components, get placed in.
+  let cppNamespace = "::my_dialect";
+}
+```
+
+The above showcases a very simple description of a dialect, but dialects have lots
+of other capabilities that you may or may not need to utilize.
+
+### Documentation
+
+The `summary` and `description` fields allow for providing user documentation
+for the dialect. The `summary` field expects a simple single-line string, with the
+`description` field used for long and extensive documentation. This documentation can be
+used to generate markdown documentation for the dialect and is used by upstream
+[MLIR dialects](https://mlir.llvm.org/docs/Dialects/).
+
+### Class Name
+
+The name of the C++ class which gets generated is the same as the name of our TableGen
+dialect definition, but with any `_` characters stripped out. This means that if you name
+your dialect `Foo_Dialect`, the generated C++ class would be `FooDialect`. In the example
+above, we would get a C++ dialect named `MyDialect`.
+
+### C++ Namespace
+
+The namespace that the C++ class for our dialect, and all of its sub-components, is placed
+under is specified by the `cppNamespace` field. By default, this uses the name of the dialect as
+the only namespace. To avoid placing in any namespace, use `""`. To specify nested namespaces,
+use `"::"` as the delimiter between namespaces, e.g., given `"A::B"`, C++ classes will be placed
+within: `namespace A { namespace B { } }`.
+
+Note that this works in conjunction with the dialect's C++ code. Depending on how the generated files
+are included, you may want to specify a full namespace path or a partial one. In general, it's best
+to use full namespaces whenever you can. This makes it easier for dialects within different namespaces,
+and projects, to interact with each other.
+
+### Dependent Dialects
+
+MLIR has a very large ecosystem, and contains dialects that serve many different purposes. It
+is quite common, given the above, that dialects may want to reuse certain components from other
+dialects. This may mean generating operations from those dialects during canonicalization, reusing
+attributes or types, etc. When a dialect has a dependency on another, i.e. when it constructs and/or
+generally relies on the components of another dialect, a dialect dependency should be explicitly
+recorded. An explicit dependency ensures that dependent dialects are loaded alongside the main
+dialect. Dialect dependencies can be recorded using the `dependentDialects` field:
+
+```tablegen
+def MyDialect : Dialect {
+  // Here we register the Arithmetic and Func dialects as dependencies of our `MyDialect`.
+  let dependentDialects = [
+    "arith::ArithmeticDialect",
+    "func::FuncDialect"
+  ];
+}
+```
+
+### Extra declarations
+
+The declarative Dialect definitions try to auto-generate as much logic and methods
+as possible. With that said, there will always be long-tail cases that won't be covered.
+For such cases, `extraClassDeclaration` can be used. Code within the `extraClassDeclaration`
+field will be copied literally to the generated C++ Dialect class.
+
+Note that `extraClassDeclaration` is a mechanism intended for long-tail cases by
+power users; for not-yet-implemented widely-applicable cases, improving the
+infrastructure is preferable.
+
+### `hasConstantMaterializer`: Materializing Constants from Attributes
+
+This field is utilized to materialize a constant operation from an `Attribute` value and
+a `Type`. This is generally used when an operation within this dialect has been folded,
+and a constant operation should be generated. `hasConstantMaterializer` is used to enable
+this materialization, and the `materializeConstant` hook is declared on the dialect. This
+hook takes in an `Attribute` value, generally returned by `fold`, and produces a
+"constant-like" operation that materializes that value. See the
+[documentation](Canonicalization.md) for canonicalization for a more in-depth
+introduction to `folding` in MLIR.
+
+Constants can then be materialized in the source file:
+
+```c++
+/// Hook to materialize a single constant operation from a given attribute value
+/// with the desired resultant type. This method should use the provided builder
+/// to create the operation without changing the insertion position. The
+/// generated operation is expected to be constant-like. On success, this hook
+/// should return the value generated to represent the constant value.
+/// Otherwise, it should return nullptr on failure.
+Operation *MyDialect::materializeConstant(OpBuilder &builder, Attribute value,
+                                          Type type, Location loc) {
+  ...
+}
+```
+
+### `hasNonDefaultDestructor`: Providing a custom destructor
+
+This field should be used when the Dialect class has a custom destructor, i.e.
+when the dialect has some special logic to be run in the `~MyDialect`. In this case,
+only the declaration of the destructor is generated for the Dialect class.
+
+### Discardable Attribute Verification
+
+As described by the [MLIR Language Reference](LangRef.md#attributes),
+*discardable attributes* are a type of attribute that has its semantics defined
+by the dialect whose name prefixes that of the attribute. For example, if an
+operation has an attribute named `gpu.contained_module`, the `gpu` dialect
+defines the semantics and invariants, such as when and where it is valid to use,
+of that attribute. To hook into this verification for attributes that are prefixed
+by our dialect, several hooks on the Dialect may be used:
+
+#### `hasOperationAttrVerify`
+
+This field generates the hook for verifying when a discardable attribute of this dialect
+has been used within the attribute dictionary of an operation. This hook has the form:
+
+```c++
+/// Generate verification for the given attribute, whose name is prefixed by the namespace
+/// of this dialect, that was used in `op`'s dictionary.
+LogicalResult MyDialect::verifyOperationAttribute(Operation *op, NamedAttribute attribute);
+```
+
+#### `hasRegionArgAttrVerify`
+
+This field generates the hook for verifying when a discardable attribute of this dialect
+has been used within the attribute dictionary of a region entry block argument. Note that
+the block arguments of a region entry block do not themselves have attribute dictionaries,
+but some operations may provide special dictionary attributes that correspond to the arguments
+of a region. For example, operations that implement `FunctionOpInterface` may have attribute
+dictionaries on the operation that correspond to the arguments of the entry block of the function.
+In these cases, those operations will invoke this hook on the dialect to ensure the attribute
+is verified. The hook necessary for the dialect to implement has the form:
+
+```c++
+/// Generate verification for the given attribute, whose name is prefixed by the namespace
+/// of this dialect, that was used on the attribute dictionary of a region entry block argument.
+/// Note: As described above, when a region entry block has a dictionary is up to the individual
+/// operation to define.
+LogicalResult MyDialect::verifyRegionArgAttribute(Operation *op, unsigned regionIndex,
+                                                  unsigned argIndex, NamedAttribute attribute);
+```
+
+#### `hasRegionResultAttrVerify`
+
+This field generates the hook for verifying when a discardable attribute of this dialect
+has been used within the attribute dictionary of a region result. Note that the results of a
+region do not themselves have attribute dictionaries, but some operations may provide special
+dictionary attributes that correspond to the results of a region. For example, operations that
+implement `FunctionOpInterface` may have attribute dictionaries on the operation that correspond
+to the results of the function. In these cases, those operations will invoke this hook on the
+dialect to ensure the attribute is verified. The hook necessary for the dialect to implement
+has the form:
+
+```c++
+/// Generate verification for the given attribute, whose name is prefixed by the namespace
+/// of this dialect, that was used on the attribute dictionary of a region result.
+/// Note: As described above, when a region result has a dictionary is up to the individual
+/// operation to define.
+LogicalResult MyDialect::verifyRegionResultAttribute(Operation *op, unsigned regionIndex,
+                                                     unsigned resultIndex, NamedAttribute attribute);
+```
+
+### Operation Interface Fallback
+
+Some dialects have an open ecosystem and don't register all of the possible operations. In such
+cases it is still possible to provide support for implementing an `OpInterface` for these
+operations. When an operation isn't registered or does not provide an implementation for an
+interface, the query will fall back to the dialect itself. The `hasOperationInterfaceFallback`
+field may be used to declare this fallback for operations:
+
+```c++
+/// Return an interface model for the interface with the given `typeId` for the operation
+/// with the given name.
+void *MyDialect::getRegisteredInterfaceForOp(TypeID typeID, StringAttr opName);
+```
+
+For a more detailed description of the expected usages of this hook, view the detailed
+[interface documentation](Interfaces.md#dialect-fallback-for-opinterface).
+
+### Default Attribute/Type Parsers and Printers
+
+When a dialect registers an Attribute or Type, it must also override the respective
+`Dialect::parseAttribute`/`Dialect::printAttribute` or
+`Dialect::parseType`/`Dialect::printType` methods. In these cases, the dialect must
+explicitly handle the parsing and printing of each individual attribute or type within
+the dialect. If all of the attributes and types of the dialect provide a mnemonic,
+however, these methods may be autogenerated by using the
+`useDefaultAttributePrinterParser` and `useDefaultTypePrinterParser` fields. By default,
+these fields are set to `1` (enabled), meaning that if a dialect needs to explicitly handle the
+parser and printer of its Attributes and Types it should set these to `0` as necessary.
+
+### Dialect-wide Canonicalization Patterns
+
+Generally, [canonicalization](Canonicalization.md) patterns are specific to individual
+operations within a dialect. There are some cases, however, that prompt canonicalization
+patterns to be added at the dialect level. For example, if a dialect defines a canonicalization
+pattern that operates on an interface or trait, it can be beneficial to only add this pattern
+once, instead of duplicating it for each operation that implements that interface. To enable the
+generation of this hook, the `hasCanonicalizer` field may be used. This will declare
+the `getCanonicalizationPatterns` method on the dialect, which has the form:
+
+```c++
+/// Return the canonicalization patterns for this dialect:
+void MyDialect::getCanonicalizationPatterns(RewritePatternSet &results) const;
+```
+
+See the documentation for [Canonicalization in MLIR](Canonicalization.md) for a much more
+detailed description about canonicalization patterns.
+
+### C++ Accessor Prefix
+
+Historically, MLIR has generated accessors for operation components (such as attributes, operands,
+results) using the tablegen definition name verbatim. This means that if an operation was defined
+as:
+
+```tablegen
+def MyOp : MyDialect<"op"> {
+  let arguments = (ins StrAttr:$value);
+}
+```
+
+It would have accessors generated for the `value` attribute as follows:
+
+```c++
+StringAttr MyOp::value();
+void MyOp::value(StringAttr newValue);
+```
+
+Since then, we have decided to move accessors over to a style that matches the rest of the
+code base. More specifically, this means that we prefix accessors with `get` and `set`
+respectively. If we look at the same example as above, this would produce:
+
+```c++
+StringAttr MyOp::getValue();
+void MyOp::setValue(StringAttr newValue);
+```
+
+The form in which accessors are generated is controlled by the `emitAccessorPrefix` field.
+This field may take any of the following values:
+
+* `kEmitAccessorPrefix_Raw`
+  - Don't emit any `get`/`set` prefix.
+
+* `kEmitAccessorPrefix_Prefixed`
+  - Only emit with `get`/`set` prefix.
+
+* `kEmitAccessorPrefix_Both`
+  - Emit with **and** without prefix.
+
+All new dialects are strongly encouraged to use the `kEmitAccessorPrefix_Prefixed` value, as
+the `Raw` form is deprecated and in the process of being removed.
+
+Note: Remove this section when all dialects have been switched to the new accessor form.