diff --git a/README.md b/README.md
--- a/README.md
+++ b/README.md
@@ -1,6 +1,6 @@
 # The LLVM Compiler Infrastructure
 
-This directory and its sub-directories contain source code for LLVM,
+This directory and its sub-directories contain the source code for LLVM,
 a toolkit for the construction of highly optimized compilers,
 optimizers, and run-time environments.
 
@@ -33,7 +33,7 @@
 
 ### Getting the Source Code and Building LLVM
 
-The LLVM Getting Started documentation may be out of date. The [Clang 
+The LLVM Getting Started documentation may be out of date. The [Clang
 Getting Started](http://clang.llvm.org/get_started.html) page might have more
 accurate information.
 
@@ -101,7 +101,7 @@
   LLVM sub-projects generate their own ``check-`` target.
 
 * Running a serial build will be **slow**. To improve speed, try running a
-  parallel build. That's done by default in Ninja; for ``make``, use the option 
+  parallel build. That's done by default in Ninja; for ``make``, use the option
   ``-j NNN``, where ``NNN`` is the number of parallel jobs to run. In most
   cases, you get the best performance if you specify the number of CPU threads
   you have. On some Unix systems, you can specify this with ``-j$(nproc)``.
diff --git a/mlir/README.md b/mlir/README.md
--- a/mlir/README.md
+++ b/mlir/README.md
@@ -1,3 +1,39 @@
-# Multi-Level Intermediate Representation
+# MLIR
+**Please visit the [website](https://mlir.llvm.org) for more information.**
+*The website gives a fuller introduction than this README.*
+
+## Overview
+The MLIR project is a novel approach to building reusable and extensible compiler infrastructure. MLIR aims to address software fragmentation, improve compilation for heterogeneous hardware, significantly reduce the cost of building domain-specific compilers, and aid in connecting existing compilers together.
+
+MLIR is a common IR that also supports hardware-specific operations. Thus, any investment into the infrastructure surrounding MLIR (e.g. the compiler passes that work on it) should yield good returns; many targets can use that infrastructure and will benefit from it. MLIR is a powerful representation, but it also has non-goals. We do not try to support low-level machine code generation algorithms (like register allocation and instruction scheduling); they are a better fit for lower-level optimizers (such as LLVM). Nor do we intend MLIR to be a source language that end-users would themselves write kernels in (analogous to CUDA C++). On the other hand, MLIR provides the backbone for representing any such DSL and integrating it into the ecosystem.
+
+*If you’d like to discuss a particular topic or have questions, please add it to the agenda [doc](https://docs.google.com/document/d/1y_9f1AbfgcoVdJh4_aM6-BaSHvrHl8zuA5G4jv_94K8/edit#). Details on how to join the meeting are in the agenda doc; you can get a Google Calendar invite by joining [this Google group](https://groups.google.com/a/tensorflow.org/g/mlir). The meetings are recorded and published in the [talks](https://mlir.llvm.org/talks/) section.*
+
+## Resources
+
+For more information on MLIR, please see:
+- The MLIR section of the LLVM [forums](https://llvm.discourse.group/c/mlir/31) for any questions.
+- Real-time discussion on the MLIR channel of the LLVM [discord](https://discord.gg/xS7Z362) server.
+- Previous [talks](https://mlir.llvm.org/talks/).
+
+## MLIR (Multi-Level Intermediate Representation)
+
+MLIR is intended to be a hybrid IR that can support multiple different requirements in a unified infrastructure. For example, this includes:
+- The ability to represent dataflow graphs (such as in TensorFlow), including dynamic shapes, the user-extensible op ecosystem, TensorFlow variables, etc.
+- Optimizations and transformations typically done on such graphs (e.g. in Grappler).
+- The ability to host high-performance-computing-style loop optimizations across kernels (fusion, loop interchange, tiling, etc.) and to transform memory layouts of data.
+- Code generation “lowering” transformations such as DMA insertion, explicit cache management, memory tiling, and vectorization for 1D and 2D register architectures.
+- The ability to represent target-specific operations, e.g. accelerator-specific high-level operations.
+- Quantization and other graph transformations done on a deep-learning graph.
+- [Polyhedral](https://mlir.llvm.org/docs/Dialects/Affine/) primitives.
+- [CIRCT](https://circt.llvm.org/) (Circuit IR Compilers and Tools) hardware-design primitives.
+
+## FAQ
+Please visit the [FAQ page](https://mlir.llvm.org/getting_started/Faq) for answers to frequently asked questions.
+
+## Citing
+Please see the [FAQ entry](https://mlir.llvm.org/getting_started/Faq/#how-to-refer-to-mlir-in-publications-is-there-an-accompanying-paper) on how to cite MLIR in publications.
+
-See [https://mlir.llvm.org/](https://mlir.llvm.org/) for more information.
diff --git a/mlir/docs/Interfaces.md b/mlir/docs/Interfaces.md
--- a/mlir/docs/Interfaces.md
+++ b/mlir/docs/Interfaces.md
@@ -37,7 +37,7 @@
 referenced later. Once the interface has been defined, dialects can override it
 using dialect-specific information. The interfaces defined by a dialect are
 registered via `addInterfaces<>`, a similar mechanism to Attributes, Operations,
-Types, etc
+Types, etc.
 
 ```c++
 /// Define a base inlining interface class to allow for dialects to opt-in to
@@ -86,7 +86,7 @@
 #### DialectInterfaceCollection
 
 An additional utility is provided via `DialectInterfaceCollection`. This class
-allows for collecting all of the dialects that have registered a given interface
+allows collecting all of the dialects that have registered a given interface
 within an instance of the `MLIRContext`. This can be useful to hide and optimize
 the lookup of a registered dialect interface.
@@ -394,8 +394,8 @@
       accessed with full name qualification.
 *   Extra Shared Class Declarations (Optional: `extraSharedClassDeclaration`)
     -   Additional C++ code that is injected into the declarations of both the
-        interface and trait class. This allows for defining methods and more
-        that are exposed on both the interface and trait class, e.g. to inject
+        interface and the trait class. This allows for defining methods and more
+        that are exposed on both the interface and the trait class, e.g. to inject
         utilties on both the interface and the derived entity implementing the
         interface (e.g. attribute, operation, etc.).
     -   In non-static methods, `$_attr`/`$_op`/`$_type`
@@ -617,7 +617,7 @@
 }
 
 // Operation interfaces can optionally be wrapped inside
-// DeclareOpInterfaceMethods. This would result in autogenerating declarations
+// `DeclareOpInterfaceMethods`. This would result in autogenerating declarations
 // for members `foo`, `bar` and `fooStatic`. Methods with bodies are not
 // declared inside the op declaration but instead handled by the op interface
 // trait directly.
This method roughly correlates to the following on an operation
+ implementing this interface:
+
+ ```c++
+ class ConcreteOp ... {
+ public:
+ Value nonStaticMethod(unsigned i);
+ };
+ ```
+ }], "Value", "nonStaticMethodWithParams", (ins "unsigned":$i)
+ >,
+
+ StaticInterfaceMethod<[{
+ This method represents a static interface method with no inputs, and a
+ void return type. This method is required to be implemented by all
+ operations implementing this interface. This method roughly correlates
+ to the following on an operation implementing this interface:
+
+ ```c++
+ class ConcreteOp ... {
+ public:
+ static void staticMethod();
+ };
+ ```
+ }], "void", "staticMethod"
+ >,
+
+ StaticInterfaceMethod<[{
+ This method corresponds to a static interface method that has an explicit
+ implementation of the method body. Given that the method body has been
+ explicitly implemented, this method should not be defined by the operation
+ implementing this method. This method merely takes advantage of properties
+ already available on the operation, in this case its `build` methods. This
+ method roughly correlates to the following on the interface `Model` class:
+
+ ```c++
+ struct InterfaceTraits {
+ /// ... The `Concept` class is elided here ...
+
+ template <typename ConcreteOp>
+ struct Model : public Concept {
+ Operation *create(OpBuilder &builder, Location loc) const override {
+ return builder.create<ConcreteOp>(loc);
+ }
+ };
+ };
+ ```
+
+ Note above how no modification is required for operations implementing an
+ interface with this method.
+ }],
+ "Operation *", "create", (ins "OpBuilder &":$builder, "Location":$loc),
+ /*methodBody=*/[{
+ return builder.create<ConcreteOp>(loc);
+ }]>,
+
+ InterfaceMethod<[{
+ This method represents a non-static method that has an explicit
+ implementation of the method body. Given that the method body has been
+ explicitly implemented, this method should not be defined by the operation
+ implementing this method.
This method merely takes advantage of properties
+ already available on the operation, in this case its `getNumInputs` and
+ `getNumOutputs` methods. This
+ method roughly correlates to the following on the interface `Model` class:
+
+ ```c++
+ struct InterfaceTraits {
+ /// ... The `Concept` class is elided here ...
+
+ template <typename ConcreteOp>
+ struct Model : public Concept {
+ unsigned getNumInputsAndOutputs(Operation *opaqueOp) const override {
+ ConcreteOp op = cast<ConcreteOp>(opaqueOp);
+ return op.getNumInputs() + op.getNumOutputs();
+ }
+ };
+ };
+ ```
+
+ Note above how no modification is required for operations implementing an
+ interface with this method.
+ }],
+ "unsigned", "getNumInputsAndOutputs", (ins), /*methodBody=*/[{
+ return $_op.getNumInputs() + $_op.getNumOutputs();
+ }]>,
+
+ InterfaceMethod<[{
+ This method represents a non-static method that has a default
+ implementation of the method body. This means that the implementation
+ defined here will be placed in the trait class that is attached to every
+ operation that implements this interface. This has no effect on the
+ generated `Concept` and `Model` class. This method roughly correlates to
+ the following on the interface `Trait` class:
+
+ ```c++
+ template <typename ConcreteOp>
+ class MyTrait : public OpTrait::TraitBase<ConcreteOp, MyTrait> {
+ public:
+ bool isSafeToTransform() {
+ ConcreteOp op = cast<ConcreteOp>(this->getOperation());
+ return op.getNumInputs() + op.getNumOutputs();
+ }
+ };
+ ```
+
+ As detailed in [Traits](Traits.md), given that each operation implementing
+ this interface will also add the interface trait, the methods on this
+ interface are inherited by the derived operation. This allows for
+ injecting a default implementation of this method into each operation that
+ implements this interface, without changing the interface class itself. If
+ an operation wants to override this default implementation, it merely
+ needs to implement the method and the derived implementation will be
+ picked up transparently by the interface class.
+
+ ```c++
+ class ConcreteOp ... {
+ public:
+ bool isSafeToTransform() {
+ // Here we can override the default implementation of the hook
+ // provided by the trait.
+ }
+ };
+ ```
+ }],
+ "bool", "isSafeToTransform", (ins), /*methodBody=*/[{}],
+ /*defaultImplementation=*/[{
+ }]>,
+ ];
+ }
+
+ // Operation interfaces can optionally be wrapped inside
+-// DeclareOpInterfaceMethods. This would result in autogenerating declarations
++// `DeclareOpInterfaceMethods`. This would result in autogenerating declarations
+ // for members `foo`, `bar` and `fooStatic`. Methods with bodies are not
+ // declared inside the op declaration but instead handled by the op interface
+ // trait directly.
+ def OpWithInferTypeInterfaceOp : Op<...
+ [DeclareOpInterfaceMethods<MyInterface>]> { ... }
+
+ // Methods that have a default implementation do not have declarations
+ // generated. If an operation wishes to override the default behavior, it can
+ // explicitly specify the method that it wishes to override. This will force
+ // the generation of a declaration for those methods.
+ def OpWithOverrideInferTypeInterfaceOp : Op<...
+ [DeclareOpInterfaceMethods<MyInterface, ["isSafeToTransform"]>]> { ... }
+ ~~~
+
+ Note: Existing operation interfaces defined in C++ can be accessed in the ODS
+ framework via the `OpInterfaceTrait` class.
+
+ #### Operation Interface List
+
+ MLIR includes standard interfaces providing functionality that is likely to be
+ common across many different operations. Below is a list of some key interfaces
+ that may be used directly by any dialect. The format of the header for each
+ interface section goes as follows:
+
+ * `Interface class name`
+ - (`C++ class` -- `ODS class` (if applicable))
+
+ ##### CallInterfaces
+
+ * `CallOpInterface` - Used to represent operations like 'call'
+ - `CallInterfaceCallable getCallableForCallee()`
+ * `CallableOpInterface` - Used to represent the target callee of a call.
+ - `Region *getCallableRegion()`
+ - `ArrayRef<Type> getCallableResults()`
+
+ ##### RegionKindInterfaces
+
+ * `RegionKindInterface` - Used to describe the abstract semantics of regions.
+ - `RegionKind getRegionKind(unsigned index)` - Return the kind of the
+ region with the given index inside this operation.
+ - `RegionKind::Graph` - represents a graph region without control flow
+ semantics
+ - `RegionKind::SSACFG` - represents an
+ [SSA-style control flow](LangRef.md/#control-flow-and-ssacfg-regions) region
+ with basic blocks and reachability
+ - `hasSSADominance(unsigned index)` - Return true if the region with the
+ given index inside this operation requires dominance.
+
+ ##### SymbolInterfaces
+
+ * `SymbolOpInterface` - Used to represent
+ [`Symbol`](SymbolsAndSymbolTables.md/#symbol) operations which reside
+ immediately within a region that defines a
+ [`SymbolTable`](SymbolsAndSymbolTables.md/#symbol-table).
+
+ * `SymbolUserOpInterface` - Used to represent operations that reference
+ [`Symbol`](SymbolsAndSymbolTables.md/#symbol) operations. This provides the
+ ability to perform safe and efficient verification of symbol uses, as well
+ as additional functionality.
diff --git a/mlir/docs/LangRef.md.rej b/mlir/docs/LangRef.md.rej
new file mode 100644
--- /dev/null
+++ b/mlir/docs/LangRef.md.rej
@@ -0,0 +1,855 @@
+diff a/mlir/docs/LangRef.md b/mlir/docs/LangRef.md (rejected hunks)
+@@ -1,849 +1,849 @@
+ # MLIR Language Reference
+
+ MLIR (Multi-Level IR) is a compiler intermediate representation with
+ similarities to traditional three-address SSA representations (like
+ [LLVM IR](http://llvm.org/docs/LangRef.html) or
+ [SIL](https://github.com/apple/swift/blob/main/docs/SIL.rst)), but which
+ introduces notions from polyhedral loop optimization as first-class concepts.
+ This hybrid design is optimized to represent, analyze, and transform high level
+ dataflow graphs as well as target-specific code generated for high performance
+ data parallel systems.
Beyond its representational capabilities, its single
+ continuous design provides a framework to lower from dataflow graphs to
+ high-performance target-specific code.
+
+ This document defines and describes the key concepts in MLIR, and is intended to
+ be a dry reference document - the
+ [rationale documentation](Rationale/Rationale.md),
+ [glossary](../getting_started/Glossary.md), and other content are hosted
+ elsewhere.
+
+ MLIR is designed to be used in three different forms: a human-readable textual
+ form suitable for debugging, an in-memory form suitable for programmatic
+ transformations and analysis, and a compact serialized form suitable for storage
+-and transport. The different forms all describe the same semantic content. This
++and transport. All the different forms describe the same semantic content. This
+ document describes the human-readable textual form.
+
+ [TOC]
+
+ ## High-Level Structure
+
+ MLIR is fundamentally based on a graph-like data structure of nodes, called
+ *Operations*, and edges, called *Values*. Each Value is the result of exactly
+ one Operation or Block Argument, and has a *Value Type* defined by the
+ [type system](#type-system). [Operations](#operations) are contained in
+ [Blocks](#blocks) and Blocks are contained in [Regions](#regions). Operations
+ are also ordered within their containing block and Blocks are ordered in their
+ containing region, although this order may or may not be semantically meaningful
+ in a given [kind of region](Interfaces.md/#regionkindinterfaces). Operations
+ may also contain regions, enabling hierarchical structures to be represented.
+
+ Operations can represent many different concepts, from higher-level concepts
+ like function definitions, function calls, buffer allocations, view or slices of
+ buffers, and process creation, to lower-level concepts like target-independent
+ arithmetic, target-specific instructions, configuration registers, and logic
+ gates.
These different concepts are represented by different operations in MLIR
+ and the set of operations usable in MLIR can be arbitrarily extended.
+
+ MLIR also provides an extensible framework for transformations on operations,
+ using familiar concepts of compiler [Passes](Passes.md). Enabling an arbitrary
+ set of passes on an arbitrary set of operations results in a significant scaling
+ challenge, since each transformation must potentially take into account the
+ semantics of any operation. MLIR addresses this complexity by allowing operation
+ semantics to be described abstractly using [Traits](Traits.md) and
+ [Interfaces](Interfaces.md), enabling transformations to operate on operations
+ more generically. Traits often describe verification constraints on valid IR,
+ enabling complex invariants to be captured and checked. (see
+ [Op vs Operation](Tutorials/Toy/Ch-2.md/#op-vs-operation-using-mlir-operations))
+
+ One obvious application of MLIR is to represent an
+ [SSA-based](https://en.wikipedia.org/wiki/Static_single_assignment_form) IR,
+ like the LLVM core IR, with appropriate choice of operation types to define
+ Modules, Functions, Branches, Memory Allocation, and verification constraints to
+ ensure the SSA Dominance property. MLIR includes a collection of dialects which
+ define just such structures. However, MLIR is intended to be general enough to
+ represent other compiler-like data structures, such as Abstract Syntax Trees in
+ a language frontend, generated instructions in a target-specific backend, or
+ circuits in a High-Level Synthesis tool.
+
+ Here's an example of an MLIR module:
+
+ ```mlir
+ // Compute A*B using an implementation of multiply kernel and print the
+ // result using a TensorFlow op. The dimensions of A and B are partially
+ // known. The shapes are assumed to match.
+ func @mul(%A: tensor<100x?xf32>, %B: tensor<?x50xf32>) -> (tensor<100x50xf32>) {
+ // Compute the inner dimension of %A using the dim operation.
+ %n = memref.dim %A, 1 : tensor<100x?xf32>
+
+ // Allocate addressable "buffers" and copy tensors %A and %B into them.
+ %A_m = memref.alloc(%n) : memref<100x?xf32>
+ memref.tensor_store %A to %A_m : memref<100x?xf32>
+
+ %B_m = memref.alloc(%n) : memref<?x50xf32>
+ memref.tensor_store %B to %B_m : memref<?x50xf32>
+
+ // Call function @multiply passing memrefs as arguments,
+ // and getting returned the result of the multiplication.
+ %C_m = call @multiply(%A_m, %B_m)
+ : (memref<100x?xf32>, memref<?x50xf32>) -> (memref<100x50xf32>)
+
+ memref.dealloc %A_m : memref<100x?xf32>
+ memref.dealloc %B_m : memref<?x50xf32>
+
+ // Load the buffer data into a higher level "tensor" value.
+ %C = memref.tensor_load %C_m : memref<100x50xf32>
+ memref.dealloc %C_m : memref<100x50xf32>
+
+ // Call TensorFlow built-in function to print the result tensor.
+ "tf.Print"(%C){message: "mul result"}
+ : (tensor<100x50xf32>) -> (tensor<100x50xf32>)
+
+ return %C : tensor<100x50xf32>
+ }
+
+ // A function that multiplies two memrefs and returns the result.
+ func @multiply(%A: memref<100x?xf32>, %B: memref<?x50xf32>)
+ -> (memref<100x50xf32>) {
+ // Compute the inner dimension of %A.
+ %n = memref.dim %A, 1 : memref<100x?xf32>
+
+ // Allocate memory for the multiplication result.
+ %C = memref.alloc() : memref<100x50xf32>
+
+ // Multiplication loop nest.
+ affine.for %i = 0 to 100 {
+ affine.for %j = 0 to 50 {
+ memref.store 0 to %C[%i, %j] : memref<100x50xf32>
+ affine.for %k = 0 to %n {
+ %a_v = memref.load %A[%i, %k] : memref<100x?xf32>
+ %b_v = memref.load %B[%k, %j] : memref<?x50xf32>
+ %prod = arith.mulf %a_v, %b_v : f32
+ %c_v = memref.load %C[%i, %j] : memref<100x50xf32>
+ %sum = arith.addf %c_v, %prod : f32
+ memref.store %sum, %C[%i, %j] : memref<100x50xf32>
+ }
+ }
+ }
+ return %C : memref<100x50xf32>
+ }
+ ```
+
+ ## Notation
+
+ MLIR has a simple and unambiguous grammar, allowing it to reliably round-trip
+ through a textual form. This is important for development of the compiler - e.g.
+ for understanding the state of code as it is being transformed and writing test
+ cases.
+
+ This document describes the grammar using
+ [Extended Backus-Naur Form (EBNF)](https://en.wikipedia.org/wiki/Extended_Backus%E2%80%93Naur_form).
+
+ This is the EBNF grammar used in this document, presented in yellow boxes.
+
+ ```
+ alternation ::= expr0 | expr1 | expr2 // Either expr0 or expr1 or expr2.
+ sequence ::= expr0 expr1 expr2 // Sequence of expr0 expr1 expr2.
+ repetition0 ::= expr* // 0 or more occurrences.
+ repetition1 ::= expr+ // 1 or more occurrences.
+ optionality ::= expr? // 0 or 1 occurrence.
+ grouping ::= (expr) // Everything inside parens is grouped together.
+ literal ::= `abcd` // Matches the literal `abcd`.
+ ```
+
+ Code examples are presented in blue boxes.
+
+ ```mlir
+ // This is an example use of the grammar above:
+ // This matches things like: ba, bana, boma, banana, banoma, bomana...
+ example ::= `b` (`an` | `om`)* `a`
+ ```
+
+ ### Common syntax
+
+ The following core grammar productions are used in this document:
+
+ ```
+ // TODO: Clarify the split between lexing (tokens) and parsing (grammar).
+ digit ::= [0-9]
+ hex_digit ::= [0-9a-fA-F]
+ letter ::= [a-zA-Z]
+ id-punct ::= [$._-]
+
+ integer-literal ::= decimal-literal | hexadecimal-literal
+ decimal-literal ::= digit+
+ hexadecimal-literal ::= `0x` hex_digit+
+ float-literal ::= [-+]?[0-9]+[.][0-9]*([eE][-+]?[0-9]+)?
+ string-literal ::= `"` [^"\n\f\v\r]* `"` TODO: define escaping rules
+ ```
+
+ Not listed here, but MLIR does support comments. They use standard BCPL syntax,
+ starting with a `//` and going until the end of the line.
+
+
+ ### Top level Productions
+
+ ```
+ // Top level production
+ toplevel := (operation | attribute-alias-def | type-alias-def)*
+ ```
+
+ The production `toplevel` is the top level production that is parsed by any parser
+ consuming the MLIR syntax.
[Operations](#operations),
+ [Attribute aliases](#attribute-value-aliases), and [Type aliases](#type-aliases)
+ can be declared on the toplevel.
+
+ ### Identifiers and keywords
+
+ Syntax:
+
+ ```
+ // Identifiers
+ bare-id ::= (letter|[_]) (letter|digit|[_$.])*
+ bare-id-list ::= bare-id (`,` bare-id)*
+ value-id ::= `%` suffix-id
+ suffix-id ::= (digit+ | ((letter|id-punct) (letter|id-punct|digit)*))
+
+ symbol-ref-id ::= `@` (suffix-id | string-literal)
+ value-id-list ::= value-id (`,` value-id)*
+
+ // Uses of value, e.g. in an operand list to an operation.
+ value-use ::= value-id
+ value-use-list ::= value-use (`,` value-use)*
+ ```
+
+ Identifiers name entities such as values, types and functions, and are chosen by
+ the writer of MLIR code. Identifiers may be descriptive (e.g. `%batch_size`,
+ `@matmul`), or may be non-descriptive when they are auto-generated (e.g. `%23`,
+ `@func42`). Identifier names for values may be used in an MLIR text file but are
+ not persisted as part of the IR - the printer will give them anonymous names
+ like `%42`.
+
+ MLIR guarantees identifiers never collide with keywords by prefixing identifiers
+ with a sigil (e.g. `%`, `#`, `@`, `^`, `!`). In certain unambiguous contexts
+ (e.g. affine expressions), identifiers are not prefixed, for brevity. New
+ keywords may be added to future versions of MLIR without danger of collision
+ with existing identifiers.
+
+ Value identifiers are only [in scope](#value-scoping) for the (nested) region in
+ which they are defined and cannot be accessed or referenced outside of that
+ region. Argument identifiers in mapping functions are in scope for the mapping
+ body. Particular operations may further limit which identifiers are in scope in
+ their regions.
For instance, the scope of values in a region with + [SSA control flow semantics](#control-flow-and-ssacfg-regions) is constrained + according to the standard definition of + [SSA dominance](https://en.wikipedia.org/wiki/Dominator_\(graph_theory\)). + Another example is the [IsolatedFromAbove trait](Traits.md/#isolatedfromabove), + which restricts directly accessing values defined in containing regions. + + Function identifiers and mapping identifiers are associated with + [Symbols](SymbolsAndSymbolTables.md) and have scoping rules dependent on symbol + attributes. + + ## Dialects + + Dialects are the mechanism by which to engage with and extend the MLIR + ecosystem. They allow for defining new [operations](#operations), as well as + [attributes](#attributes) and [types](#type-system). Each dialect is given a + unique `namespace` that is prefixed to each defined attribute/operation/type. + For example, the [Affine dialect](Dialects/Affine.md) defines the namespace: + `affine`. + + MLIR allows for multiple dialects, even those outside of the main tree, to + co-exist together within one module. Dialects are produced and consumed by + certain passes. MLIR provides a [framework](DialectConversion.md) to convert + between, and within, different dialects. + + A few of the dialects supported by MLIR: + + * [Affine dialect](Dialects/Affine.md) + * [Func dialect](Dialects/Func.md) + * [GPU dialect](Dialects/GPU.md) + * [LLVM dialect](Dialects/LLVM.md) + * [SPIR-V dialect](Dialects/SPIR-V.md) + * [Vector dialect](Dialects/Vector.md) + + ### Target specific operations + + Dialects provide a modular way in which targets can expose target-specific + operations directly through to MLIR. As an example, some targets go through + LLVM. LLVM has a rich set of intrinsics for certain target-independent + operations (e.g. addition with overflow check) as well as providing access to + target-specific operations for the targets it supports (e.g. vector permutation + operations). 
LLVM intrinsics in MLIR are represented via operations that start + with an "llvm." name. + + Example: + + ```mlir + // LLVM: %x = call {i16, i1} @llvm.sadd.with.overflow.i16(i16 %a, i16 %b) + %x:2 = "llvm.sadd.with.overflow.i16"(%a, %b) : (i16, i16) -> (i16, i1) + ``` + + These operations only work when targeting LLVM as a backend (e.g. for CPUs and + GPUs), and are required to align with the LLVM definition of these intrinsics. + + ## Operations + + Syntax: + + ``` + operation ::= op-result-list? (generic-operation | custom-operation) + trailing-location? + generic-operation ::= string-literal `(` value-use-list? `)` successor-list? + region-list? dictionary-attribute? `:` function-type + custom-operation ::= bare-id custom-operation-format + op-result-list ::= op-result (`,` op-result)* `=` + op-result ::= value-id (`:` integer-literal) + successor-list ::= `[` successor (`,` successor)* `]` + successor ::= caret-id (`:` bb-arg-list)? + region-list ::= `(` region (`,` region)* `)` + dictionary-attribute ::= `{` (attribute-entry (`,` attribute-entry)*)? `}` + trailing-location ::= (`loc` `(` location `)`)? + ``` + + MLIR introduces a uniform concept called *operations* to enable describing many + different levels of abstractions and computations. Operations in MLIR are fully + extensible (there is no fixed list of operations) and have application-specific + semantics. For example, MLIR supports + [target-independent operations](Dialects/MemRef.md), + [affine operations](Dialects/Affine.md), and + [target-specific machine operations](#target-specific-operations). + + The internal representation of an operation is simple: an operation is + identified by a unique string (e.g. `dim`, `tf.Conv2d`, `x86.repmovsb`, + `ppc.eieio`, etc), can return zero or more results, take zero or more operands, + has a dictionary of [attributes](#attributes), has zero or more successors, and + zero or more enclosed [regions](#regions). 
The generic printing form includes + all these elements literally, with a function type to indicate the types of the + results and operands. + + Example: + + ```mlir + // An operation that produces two results. + // The results of %result can be accessed via the `#` syntax. + %result:2 = "foo_div"() : () -> (f32, i32) + + // Pretty form that defines a unique name for each result. + %foo, %bar = "foo_div"() : () -> (f32, i32) + + // Invoke a TensorFlow function called tf.scramble with two inputs + // and an attribute "fruit". + %2 = "tf.scramble"(%result#0, %bar) {fruit = "banana"} : (f32, i32) -> f32 + ``` + + In addition to the basic syntax above, dialects may register known operations. + This allows those dialects to support *custom assembly form* for parsing and + printing operations. In the operation sets listed below, we show both forms. + + ### Builtin Operations + + The [builtin dialect](Dialects/Builtin.md) defines a select few operations that + are widely applicable by MLIR dialects, such as a universal conversion cast + operation that simplifies inter/intra dialect conversion. This dialect also + defines a top-level `module` operation, that represents a useful IR container. + + ## Blocks + + Syntax: + + ``` + block ::= block-label operation+ + block-label ::= block-id block-arg-list? `:` + block-id ::= caret-id + caret-id ::= `^` suffix-id + value-id-and-type ::= value-id `:` type + + // Non-empty list of names and types. + value-id-and-type-list ::= value-id-and-type (`,` value-id-and-type)* + + block-arg-list ::= `(` value-id-and-type-list? `)` + ``` + + A *Block* is a list of operations. In + [SSACFG regions](#control-flow-and-ssacfg-regions), each block represents a + compiler [basic block](https://en.wikipedia.org/wiki/Basic_block) where + instructions inside the block are executed in order and terminator operations + implement control flow branches between basic blocks. 
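+
+ For example, the block grammar above can be instantiated by a minimal labeled
+ block. This is an illustrative sketch, using operations from the `arith` and
+ `cf` dialects that also appear in the example below:
+
+ ```mlir
+ // A block labeled ^bb1 with one i64 argument. The `cf.br` terminator
+ // transfers control to ^bb2, passing %sum as its block argument.
+ ^bb1(%x: i64):
+   %sum = arith.addi %x, %x : i64
+   cf.br ^bb2(%sum : i64)
+ ```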
+ + A region with a single block may not include a + [terminator operation](#terminator-operations). The enclosing op can opt-out of + this requirement with the `NoTerminator` trait. The top-level `ModuleOp` is an +-example of such operation which defined this trait and whose block body does not ++example of such operation which defines this trait and whose block body does not + have a terminator. + + Blocks in MLIR take a list of block arguments, notated in a function-like way. + Block arguments are bound to values specified by the semantics of individual + operations. Block arguments of the entry block of a region are also arguments to + the region and the values bound to these arguments are determined by the + semantics of the containing operation. Block arguments of other blocks are + determined by the semantics of terminator operations, e.g. Branches, which have + the block as a successor. In regions with + [control flow](#control-flow-and-ssacfg-regions), MLIR leverages this structure + to implicitly represent the passage of control-flow dependent values without the + complex nuances of PHI nodes in traditional SSA representations. Note that + values which are not control-flow dependent can be referenced directly and do + not need to be passed through block arguments. + + Here is a simple example function showing branches, returns, and block + arguments: + + ```mlir + func @simple(i64, i1) -> i64 { + ^bb0(%a: i64, %cond: i1): // Code dominated by ^bb0 may refer to %a + cf.cond_br %cond, ^bb1, ^bb2 + + ^bb1: + cf.br ^bb3(%a: i64) // Branch passes %a as the argument + + ^bb2: + %b = arith.addi %a, %a : i64 + cf.br ^bb3(%b: i64) // Branch passes %b as the argument + + // ^bb3 receives an argument, named %c, from predecessors + // and passes it on to bb4 along with %a. %a is referenced + // directly from its defining operation and is not passed through + // an argument of ^bb3. 
+ ^bb3(%c: i64):
+ cf.br ^bb4(%c, %a : i64, i64)
+
+ ^bb4(%d : i64, %e : i64):
+ %0 = arith.addi %d, %e : i64
+ return %0 : i64 // Return is also a terminator.
+ }
+ ```
+
+ **Context:** The "block argument" representation eliminates a number of special
+ cases from the IR compared to traditional "PHI nodes are operations" SSA IRs
+ (like LLVM). For example, the
+ [parallel copy semantics](http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.524.5461&rep=rep1&type=pdf)
+ of SSA is immediately apparent, and function arguments are no longer a special
+ case: they become arguments to the entry block
+ [[more rationale](Rationale/Rationale.md/#block-arguments-vs-phi-nodes)]. Blocks
+ are also a fundamental concept that cannot be represented by operations because
+ values defined in an operation cannot be accessed outside the operation.
+
+ ## Regions
+
+ ### Definition
+
+ A region is an ordered list of MLIR [Blocks](#blocks). The semantics within a
+ region is not imposed by the IR. Instead, the containing operation defines the
+ semantics of the regions it contains. MLIR currently defines two kinds of
+ regions: [SSACFG regions](#control-flow-and-ssacfg-regions), which describe
+ control flow between blocks, and [Graph regions](#graph-regions), which do not
+ require control flow between blocks. The kinds of regions within an operation
+ are described using the [RegionKindInterface](Interfaces.md/#regionkindinterfaces).
+
+ Regions do not have a name or an address, only the blocks contained in a region
+ do. Regions must be contained within operations and have no type or attributes.
+ The first block in the region is a special block called the 'entry block'. The
+ arguments to the entry block are also the arguments of the region itself. The
+ entry block cannot be listed as a successor of any other block. The syntax for a
+ region is as follows:
+
+ ```
+ region ::= `{` entry-block?
block* `}`
+ entry-block ::= operation+
+ ```
+
+ A function body is an example of a region: it consists of a CFG of blocks and
+ has additional semantic restrictions that other types of regions may not have.
+ For example, in a function body, block terminators must either branch to a
+ different block, or return from a function where the types of the `return`
+ arguments must match the result types of the function signature. Similarly, the
+ function arguments must match the types and count of the region arguments. In
+ general, operations with regions can define these correspondences arbitrarily.
+
+ An *entry block* is a block with no label and no arguments that may occur at
+ the beginning of a region. It enables a common pattern of using a region to
+ open a new scope.
+
+
+ ### Value Scoping
+
+ Regions provide hierarchical encapsulation of programs: it is impossible to
+ reference, i.e. branch to, a block which is not in the same region as the source
+ of the reference, i.e. a terminator operation. Similarly, regions provide a
+ natural scoping for value visibility: values defined in a region don't escape to
+ the enclosing region, if any. By default, operations inside a region can
+ reference values defined outside of the region whenever it would have been legal
+ for operands of the enclosing operation to reference those values, but this can
+ be restricted using traits, such as
+ [OpTrait::IsolatedFromAbove](Traits.md/#isolatedfromabove), or a custom
+ verifier.
+
+ Example:
+
+ ```mlir
+ "any_op"(%a) ({ // if %a is in-scope in the containing region...
+ // then %a is in-scope here too.
+ %new_value = "another_op"(%a) : (i64) -> (i64)
+ }) : (i64) -> (i64)
+ ```
+
+ MLIR defines a generalized 'hierarchical dominance' concept that operates across
+ hierarchy and defines whether a value is 'in scope' and can be used by a
+ particular operation. Whether a value can be used by another operation in the
+ same region is defined by the kind of region.
A value defined in a region can be + used by an operation which has a parent in the same region, if and only if the + parent could use the value. A value defined by an argument to a region can + always be used by any operation deeply contained in the region. A value defined + in a region can never be used outside of the region. + + ### Control Flow and SSACFG Regions + + In MLIR, control flow semantics of a region is indicated by + [RegionKind::SSACFG](Interfaces.md/#regionkindinterfaces). Informally, these + regions support semantics where operations in a region 'execute sequentially'. + Before an operation executes, its operands have well-defined values. After an + operation executes, the operands have the same values and results also have + well-defined values. After an operation executes, the next operation in the + block executes until the operation is the terminator operation at the end of a + block, in which case some other operation will execute. The determination of the + next instruction to execute is the 'passing of control flow'. + + In general, when control flow is passed to an operation, MLIR does not restrict + when control flow enters or exits the regions contained in that operation. + However, when control flow enters a region, it always begins in the first block + of the region, called the *entry* block. Terminator operations ending each block + represent control flow by explicitly specifying the successor blocks of the + block. Control flow can only pass to one of the specified successor blocks as in + a `branch` operation, or back to the containing operation as in a `return` + operation. Terminator operations without successors can only pass control back + to the containing operation. Within these restrictions, the particular semantics + of terminator operations is determined by the specific dialect operations + involved. 
Blocks (other than the entry block) that are not listed as a successor + of a terminator operation are defined to be unreachable and can be removed + without affecting the semantics of the containing operation. + + Although control flow always enters a region through the entry block, control + flow may exit a region through any block with an appropriate terminator. The + standard dialect leverages this capability to define operations with + Single-Entry-Multiple-Exit (SEME) regions, possibly flowing through different + blocks in the region and exiting through any block with a `return` operation. + This behavior is similar to that of a function body in most programming + languages. In addition, control flow may also not reach the end of a block or + region, for example if a function call does not return. + + Example: + + ```mlir + func @accelerator_compute(i64, i1) -> i64 { // An SSACFG region + ^bb0(%a: i64, %cond: i1): // Code dominated by ^bb0 may refer to %a + cf.cond_br %cond, ^bb1, ^bb2 + + ^bb1: + // This def for %value does not dominate ^bb2 + %value = "op.convert"(%a) : (i64) -> i64 + cf.br ^bb3(%a: i64) // Branch passes %a as the argument + + ^bb2: + accelerator.launch() { // An SSACFG region + ^bb0: + // Region of code nested under "accelerator.launch", it can reference %a but + // not %value. + %new_value = "accelerator.do_something"(%a) : (i64) -> () + } + // %new_value cannot be referenced outside of the region + + ^bb3: + ... + } + ``` + + #### Operations with Multiple Regions + + An operation containing multiple regions also completely determines the + semantics of those regions. In particular, when control flow is passed to an + operation, it may transfer control flow to any contained region. When control + flow exits a region and is returned to the containing operation, the containing + operation may pass control flow to any region in the same operation. An + operation may also pass control flow to multiple contained regions concurrently. 
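+
+ As a sketch of these rules (using the upstream `scf` dialect, which is not
+ otherwise discussed in this section), an `scf.if` operation passes control
+ flow to exactly one of its two regions and receives it back when the region
+ terminates:
+
+ ```mlir
+ // Control flow enters either the "then" or the "else" region, and
+ // returns to the containing operation through scf.yield.
+ %result = scf.if %cond -> (i64) {
+   %a = arith.constant 1 : i64
+   scf.yield %a : i64
+ } else {
+   %b = arith.constant 2 : i64
+   scf.yield %b : i64
+ }
+ ```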
+ An operation may also pass control flow into regions that were specified in + other operations, in particular those that defined the values or symbols the + given operation uses as in a call operation. This passage of control is + generally independent of passage of control flow through the basic blocks of the + containing region. + + #### Closure + + Regions allow defining an operation that creates a closure, for example by + “boxing” the body of the region into a value they produce. It remains up to the + operation to define its semantics. Note that if an operation triggers + asynchronous execution of the region, it is under the responsibility of the + operation caller to wait for the region to be executed guaranteeing that any + directly used values remain live. + + ### Graph Regions + + In MLIR, graph-like semantics in a region is indicated by + [RegionKind::Graph](Interfaces.md/#regionkindinterfaces). Graph regions are + appropriate for concurrent semantics without control flow, or for modeling + generic directed graph data structures. Graph regions are appropriate for + representing cyclic relationships between coupled values where there is no + fundamental order to the relationships. For instance, operations in a graph + region may represent independent threads of control with values representing + streams of data. As usual in MLIR, the particular semantics of a region is + completely determined by its containing operation. Graph regions may only + contain a single basic block (the entry block). + + **Rationale:** Currently graph regions are arbitrarily limited to a single basic + block, although there is no particular semantic reason for this limitation. This + limitation has been added to make it easier to stabilize the pass infrastructure + and commonly used passes for processing graph regions to properly handle + feedback loops. Multi-block regions may be allowed in the future if use cases + that require it arise. 
+ + In graph regions, MLIR operations naturally represent nodes, while each MLIR + value represents a multi-edge connecting a single source node and multiple + destination nodes. All values defined in the region as results of operations are + in scope within the region and can be accessed by any other operation in the + region. In graph regions, the order of operations within a block and the order + of blocks in a region is not semantically meaningful and non-terminator + operations may be freely reordered, for instance, by canonicalization. Other + kinds of graphs, such as graphs with multiple source nodes and multiple + destination nodes, can also be represented by representing graph edges as MLIR + operations. + + Note that cycles can occur within a single block in a graph region, or between + basic blocks. + + ```mlir + "test.graph_region"() ({ // A Graph region + %1 = "op1"(%1, %3) : (i32, i32) -> (i32) // OK: %1, %3 allowed here + %2 = "test.ssacfg_region"() ({ + %5 = "op2"(%1, %2, %3, %4) : (i32, i32, i32, i32) -> (i32) // OK: %1, %2, %3, %4 all defined in the containing region + }) : () -> (i32) + %3 = "op2"(%1, %4) : (i32, i32) -> (i32) // OK: %4 allowed here + %4 = "op3"(%1) : (i32) -> (i32) + }) : () -> () + ``` + + ### Arguments and Results + + The arguments of the first block of a region are treated as arguments of the + region. The source of these arguments is defined by the semantics of the parent + operation. They may correspond to some of the values the operation itself uses. + + Regions produce a (possibly empty) list of values. The operation semantics + defines the relation between the region results and the operation results. + + ## Type System + + Each value in MLIR has a type defined by the type system. MLIR has an open type + system (i.e. there is no fixed list of types), and types may have + application-specific semantics. MLIR dialects may define any number of types + with no restrictions on the abstractions they represent. 
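+
+ As a small illustration (the type `!my_dialect.token` below is a hypothetical
+ dialect-defined type, and the ops are written in the generic form), values of
+ builtin and dialect-defined types can appear side by side in the same region:
+
+ ```mlir
+ // A value of a builtin type next to a value of a dialect-defined type.
+ %a = "x.source"() : () -> i32
+ %b = "y.source"() : () -> !my_dialect.token
+ ```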
+ + ``` + type ::= type-alias | dialect-type | builtin-type + + type-list-no-parens ::= type (`,` type)* + type-list-parens ::= `(` `)` + | `(` type-list-no-parens `)` + + // This is a common way to refer to a value with a specified type. + ssa-use-and-type ::= ssa-use `:` type + + // Non-empty list of names and types. + ssa-use-and-type-list ::= ssa-use-and-type (`,` ssa-use-and-type)* + ``` + + ### Type Aliases + + ``` + type-alias-def ::= '!' alias-name '=' 'type' type + type-alias ::= '!' alias-name + ``` + + MLIR supports defining named aliases for types. A type alias is an identifier + that can be used in the place of the type that it defines. These aliases *must* + be defined before their uses. Alias names may not contain a '.', since those + names are reserved for [dialect types](#dialect-types). + + Example: + + ```mlir + !avx_m128 = type vector<4 x f32> + + // Using the original type. + "foo"(%x) : vector<4 x f32> -> () + + // Using the type alias. + "foo"(%x) : !avx_m128 -> () + ``` + + ### Dialect Types + + Similarly to operations, dialects may define custom extensions to the type + system. + + ``` + dialect-namespace ::= bare-id + + opaque-dialect-item ::= dialect-namespace '<' string-literal '>' + + pretty-dialect-item ::= dialect-namespace '.' pretty-dialect-item-lead-ident + pretty-dialect-item-body? + + pretty-dialect-item-lead-ident ::= '[A-Za-z][A-Za-z0-9._]*' + pretty-dialect-item-body ::= '<' pretty-dialect-item-contents+ '>' + pretty-dialect-item-contents ::= pretty-dialect-item-body + | '(' pretty-dialect-item-contents+ ')' + | '[' pretty-dialect-item-contents+ ']' + | '{' pretty-dialect-item-contents+ '}' + | '[^[<({>\])}\0]+' + + dialect-type ::= '!' opaque-dialect-item + dialect-type ::= '!' pretty-dialect-item + ``` + + Dialect types can be specified in a verbose form, e.g. like this: + + ```mlir + // LLVM type that wraps around llvm IR types. + !llvm<"i32*"> + + // Tensor flow string type. 
+ !tf.string + + // Complex type + !foo<"something"> + + // Even more complex type + !foo<"something>>"> + ``` + + Dialect types that are simple enough can use the pretty format, which is a + lighter weight syntax that is equivalent to the above forms: + + ```mlir + // Tensor flow string type. + !tf.string + + // Complex type + !foo.something + ``` + + Sufficiently complex dialect types are required to use the verbose form for + generality. For example, the more complex type shown above wouldn't be valid in + the lighter syntax: `!foo.something>>` because it contains characters + that are not allowed in the lighter syntax, as well as unbalanced `<>` + characters. + + See [here](AttributesAndTypes.md) to learn how to define dialect types. + + ### Builtin Types + + The [builtin dialect](Dialects/Builtin.md) defines a set of types that are + directly usable by any other dialect in MLIR. These types cover a range from + primitive integer and floating-point types, function types, and more. + + ## Attributes + + Syntax: + + ``` + attribute-entry ::= (bare-id | string-literal) `=` attribute-value + attribute-value ::= attribute-alias | dialect-attribute | builtin-attribute + ``` + + Attributes are the mechanism for specifying constant data on operations in + places where a variable is never allowed - e.g. the comparison predicate of a + [`cmpi` operation](Dialects/ArithmeticOps.md#arithcmpi-mlirarithcmpiop). Each operation has an + attribute dictionary, which associates a set of attribute names to attribute + values. MLIR's builtin dialect provides a rich set of + [builtin attribute values](#builtin-attribute-values) out of the box (such as + arrays, dictionaries, strings, etc.). Additionally, dialects can define their + own [dialect attribute values](#dialect-attribute-values). + + The top-level attribute dictionary attached to an operation has special + semantics. 
The attribute entries are considered to be of two different kinds + based on whether their dictionary key has a dialect prefix: + + - *inherent attributes* are inherent to the definition of an operation's + semantics. The operation itself is expected to verify the consistency of + these attributes. An example is the `predicate` attribute of the + `arith.cmpi` op. These attributes must have names that do not start with a + dialect prefix. + + - *discardable attributes* have semantics defined externally to the operation +- itself, but must be compatible with the operations's semantics. These ++ itself, but must be compatible with the operations' semantics. These + attributes must have names that start with a dialect prefix. The dialect + indicated by the dialect prefix is expected to verify these attributes. An + example is the `gpu.container_module` attribute. + + Note that attribute values are allowed to themselves be dictionary attributes, + but only the top-level dictionary attribute attached to the operation is subject + to the classification above. + + ### Attribute Value Aliases + + ``` + attribute-alias-def ::= '#' alias-name '=' attribute-value + attribute-alias ::= '#' alias-name + ``` + + MLIR supports defining named aliases for attribute values. An attribute alias is + an identifier that can be used in the place of the attribute that it defines. + These aliases *must* be defined before their uses. Alias names may not contain a + '.', since those names are reserved for + [dialect attributes](#dialect-attribute-values). + + Example: + + ```mlir + #map = affine_map<(d0) -> (d0 + 10)> + + // Using the original attribute. + %b = affine.apply affine_map<(d0) -> (d0 + 10)> (%a) + + // Using the attribute alias. + %b = affine.apply #map(%a) + ``` + + ### Dialect Attribute Values + + Similarly to operations, dialects may define custom attribute values. 
The + syntactic structure of these values is identical to custom dialect type values, + except that dialect attribute values are distinguished with a leading '#', while + dialect types are distinguished with a leading '!'. + + ``` + dialect-attribute-value ::= '#' opaque-dialect-item + dialect-attribute-value ::= '#' pretty-dialect-item + ``` + + Dialect attribute values can be specified in a verbose form, e.g. like this: + + ```mlir + // Complex attribute value. + #foo<"something"> + + // Even more complex attribute value. + #foo<"something>>"> + ``` + + Dialect attribute values that are simple enough can use the pretty format, which + is a lighter weight syntax that is equivalent to the above forms: + + ```mlir + // Complex attribute + #foo.something + ``` + + Sufficiently complex dialect attribute values are required to use the verbose + form for generality. For example, the more complex type shown above would not be + valid in the lighter syntax: `#foo.something>>` because it contains + characters that are not allowed in the lighter syntax, as well as unbalanced + `<>` characters. + +-See [here](AttributesAndTypes.md) on how to define dialect attribute values. ++See [here](AttributesAndTypes.md) to learn how to define dialect attribute values. + + ### Builtin Attribute Values + + The [builtin dialect](Dialects/Builtin.md) defines a set of attribute values + that are directly usable by any other dialect in MLIR. These types cover a range + from primitive integer and floating-point values, attribute dictionaries, dense + multi-dimensional arrays, and more. 
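+
+ As a concrete sketch of the inherent/discardable split described earlier (the
+ `my_dialect.note` attribute is hypothetical), an operation written in the
+ generic form may carry both kinds of entries in its attribute dictionary:
+
+ ```mlir
+ // `predicate` is an inherent attribute of arith.cmpi, verified by the op;
+ // `my_dialect.note` is a discardable attribute verified by `my_dialect`.
+ %r = "arith.cmpi"(%a, %b) {predicate = 2 : i64, my_dialect.note = "hot"}
+      : (i64, i64) -> i1
+ ```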
diff --git a/mlir/docs/OpDefinitions.md b/mlir/docs/OpDefinitions.md --- a/mlir/docs/OpDefinitions.md +++ b/mlir/docs/OpDefinitions.md @@ -23,7 +23,7 @@ problem, e.g., repetitive string comparisons during optimization and analysis passes, unintuitive accessor methods (e.g., generic/error prone `getOperand(3)` vs self-documenting `getStride()`) with more generic return types, verbose and -generic constructors without default arguments, verbose textual IR dump, and so +generic constructors without default arguments, verbose textual IR dumps, and so on. Furthermore, operation verification is: 1. best case: a central string-to-verification-function map, @@ -57,7 +57,7 @@ We use TableGen as the language for specifying operation information. TableGen itself just provides syntax for writing records; the syntax and constructs -allowed in a TableGen file (typically with filename suffix `.td`) can be found +allowed in a TableGen file (typically with the filename suffix `.td`) can be found [here][TableGenProgRef]. * TableGen `class` is similar to C++ class; it can be templated and @@ -80,7 +80,7 @@ MLIR defines several common constructs to help operation definition and provide their semantics via a special [TableGen backend][TableGenBackend]: [`OpDefinitionsGen`][OpDefinitionsGen]. These constructs are defined in -[`OpBase.td`][OpBase]. The main ones are +[`OpBase.td`][OpBase]. The main ones are: * The `Op` class: It is the main construct for defining operations. All facts regarding the operation are specified when specializing this class, with the @@ -91,7 +91,7 @@ and constraints of the operation, including whether the operation has side effect or whether its output has the same shape as the input. * The `ins`/`outs` marker: These are two special markers builtin to the - `OpDefinitionsGen` backend. They lead the definitions of operands/attributes + `OpDefinitionsGen` backend. They lead to the definitions of operands/attributes and results respectively. 
* The `TypeConstraint` class hierarchy: They are used to specify the constraints over operands or results. A notable subclass hierarchy is @@ -134,7 +134,7 @@ ### Operation name -The operation name is a unique identifier of the operation within MLIR, e.g., +The operation name is a unique identifier for the operation within MLIR, e.g., `tf.Add` for addition operation in the TensorFlow dialect. This is the equivalent of the mnemonic in assembly language. It is used for parsing and printing in the textual format. It is also used for pattern matching in graph @@ -207,12 +207,13 @@ the return type (in the case of attributes the return type will be constructed from the storage type, while for operands it will be `Value`). Each attribute's raw value (e.g., as stored) can also be accessed via generated `Attr` -getters for use in transformation passes where the more user friendly return +getters for use in transformation passes where the more user-friendly return type is less suitable. -All the arguments should be named to 1) provide documentation, 2) drive -auto-generation of getter methods, 3) provide a handle to reference for other -places like constraints. +All the arguments should be named to: +- provide documentation, +- drive auto-generation of getter methods, and +- provide a handle to reference for other places like constraints. #### Variadic operands @@ -221,7 +222,7 @@ Normally operations have no variadic operands or just one variadic operand. For the latter case, it is easy to deduce which dynamic operands are for the static -variadic operand definition. Though, if an operation has more than one variable +variadic operand definition. However, if an operation has more than one variable length operands (either optional or variadic), it would be impossible to attribute dynamic operands to the corresponding static variadic operand definitions without further information from the operation. 
Therefore, either @@ -247,7 +248,7 @@ Normally operations have no optional operands or just one optional operand. For the latter case, it is easy to deduce which dynamic operands are for the static -operand definition. Though, if an operation has more than one variable length +operand definition. However, if an operation has more than one variable length operands (either optional or variadic), it would be impossible to attribute dynamic operands to the corresponding static variadic operand definitions without further information from the operation. Therefore, either the @@ -425,7 +426,7 @@ same form regardless of the exact op. This is particularly useful for implementing declarative pattern rewrites. -The second and third forms are good for use in manually written code given that +The second and third forms are good for use in manually written code, given that they provide better guarantee via signatures. The third form will be generated if any of the op's attribute has different @@ -434,14 +435,14 @@ Additionally, for the third form, if an attribute appearing later in the `arguments` list has a default value, the default value will be supplied in the declaration. This works for `BoolAttr`, `StrAttr`, `EnumAttr` for now and the -list can grow in the future. So if possible, default valued attribute should be +list can grow in the future. So if possible, the default-valued attribute should be placed at the end of the `arguments` list to leverage this feature. (This behavior is essentially due to C++ function parameter default value placement restrictions.) Otherwise, the builder of the third form will still be generated but default values for the attributes not at the end of the `arguments` list will not be supplied in the builder's signature. 
-ODS will generate a builder that doesn't require return type specified if +ODS will generate a builder that doesn't require the return type specified if * Op implements InferTypeOpInterface interface; * All return types are either buildable types or are the same as a given @@ -581,18 +582,18 @@ The verification of an operation involves several steps, 1. StructuralOpTrait will be verified first, they can be run independently. -1. `verifyInvariants` which is constructed by ODS, it verifies the type, +2. `verifyInvariants` which is constructed by ODS, it verifies the type, attributes, .etc. -1. Other Traits/Interfaces that have marked their verifier as `verifyTrait` or +3. Other Traits/Interfaces that have marked their verifier as `verifyTrait` or `verifyWithRegions=0`. -1. Custom verifier which is defined in the op and has marked `hasVerifier=1` +4. Custom verifier which is defined in the op and has been marked `hasVerifier=1` If an operation has regions, then it may have the second phase, 1. Traits/Interfaces that have marked their verifier as `verifyRegionTrait` or `verifyWithRegions=1`. This implies the verifier needs to access the operations in its regions. -1. Custom verifier which is defined in the op and has marked +2. Custom verifier which is defined in the op and has been marked `hasRegionVerifier=1` Note that the second phase will be run after the operations in the region are diff --git a/mlir/docs/OpDefinitions.md.rej b/mlir/docs/OpDefinitions.md.rej new file mode 100644 --- /dev/null +++ b/mlir/docs/OpDefinitions.md.rej @@ -0,0 +1,1633 @@ +diff a/mlir/docs/OpDefinitions.md b/mlir/docs/OpDefinitions.md (rejected hunks) +@@ -1,1612 +1,1613 @@ + # Operation Definition Specification (ODS) + + In addition to specializing the `mlir::Op` C++ template, MLIR also supports + defining operations and data types in a table-driven manner. 
This is achieved + via [TableGen][TableGen], which is both a generic language and its tooling to + maintain records of domain-specific information. Facts regarding an operation + are specified concisely into a TableGen record, which will be expanded into an + equivalent `mlir::Op` C++ template specialization at compiler build time. + + This manual explains in detail all the available mechanisms for defining + operations in such a table-driven manner. It aims to be a specification instead + of a tutorial. Please refer to + [Quickstart tutorial to adding MLIR graph rewrite](Tutorials/QuickstartRewrites.md) + for the latter. + + In addition to detailing each mechanism, this manual also tries to capture best + practices. They are rendered as quoted bullet points. + + ## Motivation + + MLIR allows pluggable dialects, and dialects contain, among others, a list of + operations. This open and extensible ecosystem leads to the "stringly" type IR + problem, e.g., repetitive string comparisons during optimization and analysis + passes, unintuitive accessor methods (e.g., generic/error prone `getOperand(3)` + vs self-documenting `getStride()`) with more generic return types, verbose and +-generic constructors without default arguments, verbose textual IR dump, and so ++generic constructors without default arguments, verbose textual IR dumps, and so + on. Furthermore, operation verification is: + + 1. best case: a central string-to-verification-function map, + 1. middle case: duplication of verification across the code base, or + 1. worst case: no verification functions. + + The fix is to support defining ops in a table-driven manner. Then for each + dialect, we can have a central place that contains everything you need to know + about each op, including its constraints, custom assembly form, etc. This + description is also used to generate helper functions and classes to allow + building, verification, parsing, printing, analysis, and many more. 
+ + ## Benefits + + Compared to the C++ template, this table-driven approach has several benefits + including but not limited to: + + * **Single source of truth**: We strive to encode all facts regarding an + operation into the record, so that readers don't need to jump among code + snippets to fully understand an operation. + * **Removing boilerplate**: We can automatically generate + operand/attribute/result getter methods, operation build methods, operation + verify methods, and many more utilities from the record. This greatly + reduces the boilerplate needed for defining a new op. + * **Facilitating auto-generation**: The usage of these operation information + records are by no means limited to op definition itself. We can use them to + drive the auto-generation of many other components, like computation graph + serialization. + + ## TableGen Syntax + + We use TableGen as the language for specifying operation information. TableGen + itself just provides syntax for writing records; the syntax and constructs +-allowed in a TableGen file (typically with filename suffix `.td`) can be found ++allowed in a TableGen file (typically with the filename suffix `.td`) can be found + [here][TableGenProgRef]. + + * TableGen `class` is similar to C++ class; it can be templated and + subclassed. + * TableGen `def` is similar to C++ object; it can be declared by specializing + a TableGen `class` (e.g., `def MyDef : MyClass<...>;`) or completely + independently (e.g., `def MyDef;`). It cannot be further templated or + subclassed. + * TableGen `dag` is a dedicated type for directed acyclic graph of elements. A + `dag` has one operator and zero or more arguments. Its syntax is `(operator + arg0, arg1, argN)`. The operator can be any TableGen `def`; an argument can + be anything, including `dag` itself. We can have names attached to both the + operator and the arguments like `(MyOp:$op_name MyArg:$arg_name)`. 
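+
+ A minimal sketch tying these constructs together (all names below are made up
+ for illustration):
+
+ ```tablegen
+ // A templated TableGen class, similar to a C++ class template.
+ class MyClass<string mnemonic> {
+   string opName = mnemonic;
+ }
+
+ // A def specializing the class; it cannot be templated or subclassed further.
+ def MyDef : MyClass<"my_op">;
+
+ // A dag with a named operator and a named argument:
+ //   (MyOp:$op_name MyArg:$arg_name)
+ ```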
+ + Please see the [language reference][TableGenProgRef] to learn about all the + types and expressions supported by TableGen. + + ## Operation Definition + + MLIR defines several common constructs to help operation definition and provide + their semantics via a special [TableGen backend][TableGenBackend]: + [`OpDefinitionsGen`][OpDefinitionsGen]. These constructs are defined in +-[`OpBase.td`][OpBase]. The main ones are ++[`OpBase.td`][OpBase]. The main ones are: + + * The `Op` class: It is the main construct for defining operations. All facts + regarding the operation are specified when specializing this class, with the + help of the following constructs. + * The `Dialect` class: Operations belonging to one logical group are placed in + the same dialect. The `Dialect` class contains dialect-level information. + * The `OpTrait` class hierarchy: They are used to specify special properties + and constraints of the operation, including whether the operation has side + effect or whether its output has the same shape as the input. + * The `ins`/`outs` marker: These are two special markers builtin to the +- `OpDefinitionsGen` backend. They lead the definitions of operands/attributes ++ `OpDefinitionsGen` backend. They lead to the definitions of operands/attributes + and results respectively. + * The `TypeConstraint` class hierarchy: They are used to specify the + constraints over operands or results. A notable subclass hierarchy is + `Type`, which stands for constraints for common C++ types. + * The `AttrConstraint` class hierarchy: They are used to specify the + constraints over attributes. A notable subclass hierarchy is `Attr`, which + stands for constraints for attributes whose values are of common types. + + An operation is defined by specializing the `Op` class with concrete contents + for all the fields it requires. 
For example, `tf.AvgPool` is defined as
+
+ ```tablegen
+ def TF_AvgPoolOp : TF_Op<"AvgPool", [NoSideEffect]> {
+   let summary = "Performs average pooling on the input.";
+
+   let description = [{
+ Each entry in `output` is the mean of the corresponding size `ksize`
+ window in `value`.
+   }];
+
+   let arguments = (ins
+     TF_FpTensor:$value,
+
+     Confined<I64ArrayAttr, [ArrayMinCount<4>]>:$ksize,
+     Confined<I64ArrayAttr, [ArrayMinCount<4>]>:$strides,
+     TF_AnyStrAttrOf<["SAME", "VALID"]>:$padding,
+     DefaultValuedAttr<TF_ConvnetDataFormatAttr, "NHWC">:$data_format
+   );
+
+   let results = (outs
+     TF_FpTensor:$output
+   );
+
+   TF_DerivedOperandTypeAttr T = TF_DerivedOperandTypeAttr<0>;
+ }
+ ```
+
+ In the following we describe all the fields needed. Please see the definition of
+ the `Op` class for the complete list of fields supported.
+
+ ### Operation name
+
+-The operation name is a unique identifier of the operation within MLIR, e.g.,
++The operation name is a unique identifier for the operation within MLIR, e.g.,
+ `tf.Add` for the addition operation in the TensorFlow dialect. This is the
+ equivalent of the mnemonic in assembly language. It is used for parsing and
+ printing in the textual format. It is also used for pattern matching in graph
+ rewrites.
+
+ The full operation name is composed of the dialect name and the op name, with
+ the former provided via the dialect and the latter provided as the second
+ template parameter to the `Op` class.
+
+ ### Operation documentation
+
+ This includes both a one-line `summary` and a longer, human-readable
+ `description`. They will be used to drive automatic generation of dialect
+ documentation. They need to be provided in the operation's definition body:
+
+ ```tablegen
+ let summary = "...";
+
+ let description = [{
+ ...
+ }];
+ ```
+
+ `description` should be written in Markdown syntax.
+
+ Placing the documentation at the beginning is recommended since it helps in
+ understanding the operation.
+
+ > * Place documentation at the beginning of the operation definition.
+ > * The summary should be short and concise. It should be a one-liner without
+ >   trailing punctuation. Put the expanded explanation in the description.
+
+ ### Operation arguments
+
+ There are two kinds of arguments: operands and attributes. Operands are runtime
+ values produced by other ops, while attributes are compile-time known constant
+ values, falling into two categories:
+
+ 1. Natural attributes: these attributes affect the behavior of the operations
+    (e.g., padding for convolution);
+ 1. Derived attributes: these attributes are not needed to define the operation
+    but are instead derived from information of the operation. E.g., the output
+    shape of type. This is mostly used for convenience interface generation or
+    interaction with other frameworks/translation.
+
+    All derived attributes should be materializable as an Attribute. That is,
+    even though they are not materialized, it should be possible to store them
+    as an attribute.
+
+ Both operands and attributes are specified inside the `dag`-typed `arguments`,
+ led by `ins`:
+
+ ```tablegen
+ let arguments = (ins
+   <type-constraint>:$<operand-name>,
+   ...
+   <attr-constraint>:$<attr-name>,
+   ...
+ );
+ ```
+
+ Here `<type-constraint>` is a TableGen `def` from the `TypeConstraint` class
+ hierarchy. Similarly, `<attr-constraint>` is a TableGen `def` from the
+ `AttrConstraint` class hierarchy. See [Constraints](#constraints) for more
+ information.
+
+ There are no requirements on the relative order of operands and attributes; they
+ can mix freely. The relative order of operands themselves matters. From each
+ named argument a named getter will be generated that returns the argument with
+ the return type (in the case of attributes the return type will be constructed
+ from the storage type, while for operands it will be `Value`). Each attribute's
+ raw value (e.g., as stored) can also be accessed via generated `<name>Attr`
+-getters for use in transformation passes where the more user friendly return
++getters for use in transformation passes where the more user-friendly return
+ type is less suitable.
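+
+ For instance (a hypothetical op; the exact accessor names also depend on the
+ dialect's accessor-prefix setting), named arguments drive getter generation:
+
+ ```tablegen
+ def MyDialect_ScaleOp : Op<MyDialect, "scale"> {
+   let arguments = (ins
+     F32Tensor:$input,  // generates a getter returning the `input` operand Value
+     F32Attr:$factor    // generates a getter plus a raw `factor` attribute getter
+   );
+   let results = (outs F32Tensor:$output);
+ }
+ ```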

All the arguments should be named to:

- provide documentation,
- drive auto-generation of getter methods, and
- provide a handle to reference them in other places like constraints.

#### Variadic operands

To declare a variadic operand, wrap the `TypeConstraint` for the operand with
`Variadic<...>`.

Normally operations have no variadic operands or just one variadic operand. For
the latter case, it is easy to deduce which dynamic operands belong to the
static variadic operand definition. However, if an operation has more than one
variable-length operand (either optional or variadic), it would be impossible
to attribute dynamic operands to the corresponding static variadic operand
definitions without further information from the operation. Therefore, either
the `SameVariadicOperandSize` or `AttrSizedOperandSegments` trait is needed to
indicate that all variable-length operands have the same number of dynamic
values.

#### VariadicOfVariadic operands

To declare a variadic operand that has a variadic number of sub-ranges, wrap
the `TypeConstraint` for the operand with
`VariadicOfVariadic<..., "<size-attribute-name>">`.

The second field of `VariadicOfVariadic` is the name of an `I32ElementsAttr`
argument that contains the sizes of the variadic sub-ranges. This attribute
will be used when determining the size of sub-ranges, or when updating the size
of sub-ranges.

#### Optional operands

To declare an optional operand, wrap the `TypeConstraint` for the operand with
`Optional<...>`.

Normally operations have no optional operands or just one optional operand. For
the latter case, it is easy to deduce which dynamic operands belong to the
static operand definition.
However, if an operation has more than one variable-length operand (either
optional or variadic), it would be impossible to attribute dynamic operands to
the corresponding static variadic operand definitions without further
information from the operation. Therefore, either the `SameVariadicOperandSize`
or `AttrSizedOperandSegments` trait is needed to indicate that all
variable-length operands have the same number of dynamic values.

#### Optional attributes

To declare an optional attribute, wrap the `AttrConstraint` for the attribute
with `OptionalAttr<...>`.

#### Attributes with default values

To declare an attribute with a default value, wrap the `AttrConstraint` for the
attribute with `DefaultValuedAttr<..., "...">`.

The second parameter to `DefaultValuedAttr` should be a string containing the
C++ default value. For example, a float default value should be specified as
`"0.5f"`, and an integer array default value should be specified as
`"{1, 2, 3}"`.

#### Confining attributes

`Confined` is provided as a general mechanism to help model further constraints
on attributes beyond the ones brought by value types. You can use `Confined` to
compose complex constraints out of more primitive ones. For example, a 32-bit
integer attribute whose minimum value must be 10 can be expressed as
`Confined<I32Attr, [IntMinValue<10>]>`.
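
A hypothetical sketch of how such composed constraints might appear inside an
op definition (the dialect, op, and attribute names here are invented for
illustration; the constraint classes are the ones described in this document):

```tablegen
def MyDialect_PoolOp : MyDialect_Op<"pool", []> {
  let arguments = (ins
    // A 64-bit integer array attribute: Confined composes the base
    // I64ArrayAttr constraint with a minimum element count of 4.
    Confined<I64ArrayAttr, [ArrayMinCount<4>]>:$window,
    // A 32-bit integer attribute confined to the range [1, 8] by
    // composing two primitive constraints.
    Confined<I32Attr, [IntMinValue<1>, IntMaxValue<8>]>:$rank
  );
}
```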

Right now, the following primitive constraints are supported:

*   `IntMinValue<N>`: Specifying an integer attribute to be greater than or
    equal to `N`
*   `IntMaxValue<N>`: Specifying an integer attribute to be less than or equal
    to `N`
*   `ArrayMinCount<N>`: Specifying an array attribute to have at least `N`
    elements
*   `IntArrayNthElemEq<I, N>`: Specifying an integer array attribute's `I`-th
    element to be equal to `N`
*   `IntArrayNthElemMinValue<I, N>`: Specifying an integer array attribute's
    `I`-th element to be greater than or equal to `N`

TODO: Design and implement more primitive constraints

### Operation regions

The regions of an operation are specified inside of the `dag`-typed `regions`,
led by `region`:

```tablegen
let regions = (region
  <region-constraint>:$<region-name>,
  ...
);
```

#### Variadic regions

Similar to the `Variadic` class used for variadic operands and results,
`VariadicRegion<...>` can be used for regions. Variadic regions can currently
only be specified as the last region in the regions list.

### Operation results

Similar to operands, results are specified inside the `dag`-typed `results`,
led by `outs`:

```tablegen
let results = (outs
  <type-constraint>:$<result-name>,
  ...
);
```

#### Variadic results

Similar to variadic operands, `Variadic<...>` can also be used for results.
And similarly, `SameVariadicResultSize` can be used for multiple variadic
results in the same operation.

### Operation successors

For terminator operations, the successors are specified inside of the
`dag`-typed `successors`, led by `successor`:

```tablegen
let successors = (successor
  <successor-constraint>:$<successor-name>,
  ...
);
```

#### Variadic successors

Similar to the `Variadic` class used for variadic operands and results,
`VariadicSuccessor<...>` can be used for successors. Variadic successors can
currently only be specified as the last successor in the successor list.
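
As a sketch of the successor syntax above, a hypothetical conditional-branch
terminator might declare two successor blocks like this (the dialect and op
names are invented; `AnySuccessor` and the `Terminator` trait are assumed to be
available from the usual ODS base definitions):

```tablegen
def MyDialect_CondBrOp : MyDialect_Op<"cond_br", [Terminator]> {
  let arguments = (ins I1:$condition);

  // Two successor blocks: one taken when the condition is true,
  // one taken when it is false.
  let successors = (successor
    AnySuccessor:$trueDest,
    AnySuccessor:$falseDest
  );
}
```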

### Operation traits and constraints

Traits are operation properties that affect syntax or semantics. MLIR C++
models various traits in the `mlir::OpTrait` namespace.

Operation traits, [interfaces](Interfaces.md/#utilizing-the-ods-framework), and
constraints involving multiple operands/attributes/results are provided as the
third template parameter to the `Op` class. They should derive from the
`OpTrait` class. See [Constraints](#constraints) for more information.

### Builder methods

For each operation, there are a few builders automatically generated based on
the argument and return types. For example, given the following op definition:

```tablegen
def MyOp : ... {
  let arguments = (ins
    I32:$i32_operand,
    F32:$f32_operand,
    ...,

    I32Attr:$i32_attr,
    F32Attr:$f32_attr,
    ...
  );

  let results = (outs
    I32:$i32_result,
    F32:$f32_result,
    ...
  );
}
```

The following builders are generated:

```c++
// All result-types/operands/attributes have one aggregate parameter.
static void build(OpBuilder &odsBuilder, OperationState &odsState,
                  ArrayRef<Type> resultTypes,
                  ValueRange operands,
                  ArrayRef<NamedAttribute> attributes);

// Each result-type/operand/attribute has a separate parameter. The parameters
// for attributes are of mlir::Attribute types.
static void build(OpBuilder &odsBuilder, OperationState &odsState,
                  Type i32_result, Type f32_result, ...,
                  Value i32_operand, Value f32_operand, ...,
                  IntegerAttr i32_attr, FloatAttr f32_attr, ...);

// Each result-type/operand/attribute has a separate parameter. The parameters
// for attributes are raw values unwrapped with mlir::Attribute instances.
// (Note that this builder will not always be generated. See the following
// explanation for more details.)
static void build(OpBuilder &odsBuilder, OperationState &odsState,
                  Type i32_result, Type f32_result, ...,
                  Value i32_operand, Value f32_operand, ...,
                  APInt i32_attr, StringRef f32_attr, ...);

// Each operand/attribute has a separate parameter but result type is aggregate.
static void build(OpBuilder &odsBuilder, OperationState &odsState,
                  ArrayRef<Type> resultTypes,
                  Value i32_operand, Value f32_operand, ...,
                  IntegerAttr i32_attr, FloatAttr f32_attr, ...);

// All operands/attributes have aggregate parameters.
// Generated if return type can be inferred.
static void build(OpBuilder &odsBuilder, OperationState &odsState,
                  ValueRange operands, ArrayRef<NamedAttribute> attributes);

// (And manually specified builders depending on the specific op.)
```

The first form provides basic uniformity so that we can create ops using the
same form regardless of the exact op. This is particularly useful for
implementing declarative pattern rewrites.

The second and third forms are good for use in manually written code, given
that they provide a better guarantee via their signatures.

The third form will be generated if any of the op's attributes has a different
`Attr.returnType` from `Attr.storageType` and we know how to build an attribute
from an unwrapped value (i.e., `Attr.constBuilderCall` is defined).
Additionally, for the third form, if an attribute appearing later in the
`arguments` list has a default value, the default value will be supplied in the
declaration. This works for `BoolAttr`, `StrAttr`, and `EnumAttr` for now, and
the list can grow in the future. So if possible, default-valued attributes
should be placed at the end of the `arguments` list to leverage this feature.
(This behavior is essentially due to C++ function parameter default value
placement restrictions.)
Otherwise, the builder of the third form will still be generated, but default
values for the attributes not at the end of the `arguments` list will not be
supplied in the builder's signature.

ODS will generate a builder that doesn't require the return type to be
specified if

*   the op implements the `InferTypeOpInterface` interface, or
*   all return types are either buildable types or are the same as a given
    operand (e.g., an `AllTypesMatch` constraint between operand and result).

There may potentially exist other builders depending on the specific op;
please refer to the
[generated C++ file](#run-mlir-tblgen-to-see-the-generated-content) for the
complete list.

#### Custom builder methods

However, if the above cases cannot satisfy all needs, you can define additional
convenience build methods in the `builders` field as follows.

```tablegen
def MyOp : Op<"my_op", []> {
  let arguments = (ins F32Attr:$attr);

  let builders = [
    OpBuilder<(ins "float":$val)>
  ];
}
```

The `builders` field is a list of custom builders that are added to the Op
class. In this example, we provide a convenience builder that takes a
floating-point value instead of an attribute. The `ins` prefix is common to
many function declarations in ODS, which use a TableGen
[`dag`](#tablegen-syntax). What follows is a comma-separated list of types
(quoted string) and names prefixed with the `$` sign. This will generate the
declaration of a builder method that looks like:

```c++
class MyOp : /*...*/ {
  /*...*/
  static void build(::mlir::OpBuilder &builder, ::mlir::OperationState &state,
                    float val);
};
```

Note that the method has two additional leading arguments. These arguments are
useful to construct the operation. In particular, the method must populate
`state` with attributes, operands, regions and result types of the operation
to be constructed.
`builder` can be used to construct any IR objects that belong to the Op, such
as types or nested operations. Since the type and name are generated as-is in
the C++ code, they should be valid C++ constructs for a type (in the namespace
of the Op) and an identifier (e.g., `class` is not a valid identifier).

Implementations of the builder can be provided directly in ODS, using a
TableGen code block as follows.

```tablegen
def MyOp : Op<"my_op", []> {
  let arguments = (ins F32Attr:$attr);

  let builders = [
    OpBuilder<(ins "float":$val), [{
      $_state.addAttribute("attr", $_builder.getF32FloatAttr(val));
    }]>
  ];
}
```

The equivalents of the `builder` and `state` arguments are available as the
`$_builder` and `$_state` special variables. The named arguments listed in the
`ins` part are available directly, e.g. `val`. The body of the builder will be
generated by substituting the special variables and should otherwise be valid
C++. While there is no limitation on the code size, we encourage one to define
only short builders inline in ODS and put definitions of longer builders in
C++ files.

Finally, if some arguments need a default value, they can be defined using
`CArg` to wrap the type and this value as follows.

```tablegen
def MyOp : Op<"my_op", []> {
  let arguments = (ins F32Attr:$attr);

  let builders = [
    OpBuilder<(ins CArg<"float", "0.5f">:$val), [{
      $_state.addAttribute("attr", $_builder.getF32FloatAttr(val));
    }]>
  ];
}
```

The generated code will use the default value in the declaration, but not in
the definition, as required by C++.

```c++
/// Header file.
class MyOp : /*...*/ {
  /*...*/
  static void build(::mlir::OpBuilder &builder, ::mlir::OperationState &state,
                    float val = 0.5f);
};

/// Source file.
MyOp::build(::mlir::OpBuilder &builder, ::mlir::OperationState &state,
            float val) {
  state.addAttribute("attr", builder.getF32FloatAttr(val));
}
```

**Deprecated:** The `OpBuilder` class allows one to specify the custom builder
signature as a raw string, without separating parameters into different `dag`
arguments. It also supports leading parameters of `OpBuilder &` and
`OperationState &` types, which will be used instead of the autogenerated ones
if present.

### Custom parser and printer methods

Functions to parse and print the operation's custom assembly form.

### Custom verifier code

Verification code will be automatically generated for
[constraints](#constraints) specified on various entities of the op. To perform
_additional_ verification, you can use

```tablegen
let hasVerifier = 1;
let hasRegionVerifier = 1;
```

This will generate `LogicalResult verify()`/`LogicalResult verifyRegions()`
method declarations on the op class that can be defined with any additional
verification constraints. For verification which needs to access the nested
operations, you should use `hasRegionVerifier` to ensure that it won't access
any ill-formed operation. Other verifications can be implemented with
`hasVerifier`. Check the next section for the execution order of these
verification methods.

#### Verification Ordering

The verification of an operation involves several steps:

1.  `StructuralOpTrait` verifiers will be run first; they can be run
    independently.
2.  `verifyInvariants`, which is constructed by ODS; it verifies the types,
    attributes, etc.
3.  Other traits/interfaces that have marked their verifier as `verifyTrait`
    or `verifyWithRegions=0`.
4.  The custom verifier which is defined in the op and has been marked
    `hasVerifier=1`.

If an operation has regions, then it may have a second phase:

1.  Traits/interfaces that have marked their verifier as `verifyRegionTrait`
    or `verifyWithRegions=1`. This implies that the verifier needs to access
    the operations in its regions.
2.  The custom verifier which is defined in the op and has been marked
    `hasRegionVerifier=1`.

Note that the second phase will be run after the operations in the region are
verified. Verifiers further down the order can rely on certain invariants being
verified by a previous verifier and do not need to re-verify them.

#### Emitting diagnostics in custom verifiers

Custom verifiers should avoid printing operations using custom operation
printers, because they require the printed operation (and sometimes its parent
operation) to be verified first. In particular, when emitting diagnostics,
custom verifiers should use the `Error` severity level, which prints operations
in generic form by default, and avoid using lower severity levels (`Note`,
`Remark`, `Warning`).

### Declarative Assembly Format

The custom assembly form of the operation may be specified in a declarative
string that matches the operation's operands, attributes, etc., with the
ability to express additional information that needs to be parsed to build the
operation:

```tablegen
def CallOp : Std_Op<"call", ...> {
  let arguments = (ins FlatSymbolRefAttr:$callee, Variadic<AnyType>:$args);
  let results = (outs Variadic<AnyType>);

  let assemblyFormat = [{
    $callee `(` $args `)` attr-dict `:` functional-type($args, results)
  }];
}
```

The format is comprised of three components:

#### Directives

A directive is a type of builtin function, with an optional set of arguments.
The available directives are as follows:

*   `attr-dict`

    -   Represents the attribute dictionary of the operation.

*   `attr-dict-with-keyword`

    -   Represents the attribute dictionary of the operation, but prefixes the
        dictionary with an `attributes` keyword.

*   `custom` < UserDirective > ( Params )

    -   Represents a custom directive implemented by the user in C++.
    -   See the [Custom Directives](#custom-directives) section below for more
        details.

*   `functional-type` ( inputs , results )

    -   Formats the `inputs` and `results` arguments as a
        [function type](Dialects/Builtin.md/#functiontype).
    -   The constraints on `inputs` and `results` are the same as the `input`
        of the `type` directive.

*   `oilist` ( \`keyword\` elements | \`otherKeyword\` elements ...)

    -   Represents an optional order-independent list of clauses. Each clause
        has a keyword and a corresponding assembly format.
    -   Each clause can appear 0 or 1 time (in any order).
    -   Only literals, types and variables can be used within an oilist
        element.
    -   All the variables must be optional or variadic.

*   `operands`

    -   Represents all of the operands of an operation.

*   `ref` ( input )

    -   Represents a reference to a variable or directive, that must have
        already been resolved, to be used as a parameter to a `custom`
        directive.
    -   Used to pass previously parsed entities to custom directives.
    -   The input may be any directive or variable, aside from
        `functional-type` and `custom`.

*   `regions`

    -   Represents all of the regions of an operation.

*   `results`

    -   Represents all of the results of an operation.

*   `successors`

    -   Represents all of the successors of an operation.

*   `type` ( input )

    -   Represents the type of the given input.
    -   `input` must be either an operand or result [variable](#variables),
        the `operands` directive, or the `results` directive.

*   `qualified` ( type_or_attribute )

    -   Wraps a `type` directive or an attribute parameter.
    -   Used to force printing the type or attribute prefixed with its dialect
        and mnemonic. For example, the `vector.multi_reduction` operation has
        a `kind` attribute; by default the declarative assembly will print
        `vector.multi_reduction <minimum>, ...`, but using `qualified($kind)`
        in the declarative assembly format will print it instead as
        `vector.multi_reduction #vector.kind<minimum>, ...`.

#### Literals

A literal is either a keyword or punctuation surrounded by \`\`.

The following are the set of valid punctuation:

`:`, `,`, `=`, `<`, `>`, `(`, `)`, `{`, `}`, `[`, `]`, `->`, `?`, `+`, `*`

The following are valid whitespace punctuation:

`\n`, ` `

The `\n` literal emits a newline and indents to the start of the operation. An
example is shown below:

```tablegen
let assemblyFormat = [{
  `{` `\n` ` ` ` ` `this_is_on_a_newline` `\n` `}` attr-dict
}];
```

```mlir
%results = my.operation {
  this_is_on_a_newline
}
```

An empty literal \`\` may be used to remove a space that is inserted implicitly
after certain literal elements, such as `)`/`]`/etc. For example, "`]`" may
result in an output of `]` if it is not the last element in the format.
"`]` \`\`" would trim the trailing space in this situation.

#### Variables

A variable is an entity that has been registered on the operation itself, i.e.
an argument (attribute or operand), region, result, successor, etc. In the
`CallOp` example above, the variables would be `$callee` and `$args`.

Attribute variables are printed with their respective value type, unless that
value type is buildable. In those cases, the type of the attribute is elided.

#### Custom Directives

The declarative assembly format specification allows for handling a large
majority of the common cases when formatting an operation.
For the operations that require or desire specifying parts of the operation in
a form not supported by the declarative syntax, custom directives may be
specified. A custom directive essentially allows users to use C++ for printing
and parsing subsections of an otherwise declaratively specified format. Looking
at the specification of a custom directive above:

```
custom-directive ::= `custom` `<` UserDirective `>` `(` Params `)`
```

A custom directive has two main parts: the `UserDirective` and the `Params`. A
custom directive is transformed into a call to a `print*` and a `parse*` method
when generating the C++ code for the format. The `UserDirective` is an
identifier used as a suffix to these two calls, i.e.,
`custom<MyDirective>(...)` would result in calls to `parseMyDirective` and
`printMyDirective` within the parser and printer respectively. `Params` may be
any combination of variables (i.e. Attribute, Operand, Successor, etc.), type
directives, and `attr-dict`. The type directives must refer to a variable, but
that variable need not also be a parameter to the custom directive.

The arguments to the `parse<UserDirective>` method are firstly a reference to
the `OpAsmParser` (`OpAsmParser &`), and secondly a set of output parameters
corresponding to the parameters specified in the format. The mapping of
declarative parameter to `parse` method argument is detailed below:

*   Attribute Variables
    -   Single: `<Attribute-Storage-Type>` (e.g. `Attribute`) `&`
    -   Optional: `<Attribute-Storage-Type>` (e.g. `Attribute`) `&`
*   Operand Variables
    -   Single: `OpAsmParser::UnresolvedOperand &`
    -   Optional: `Optional<OpAsmParser::UnresolvedOperand> &`
    -   Variadic: `SmallVectorImpl<OpAsmParser::UnresolvedOperand> &`
    -   VariadicOfVariadic:
        `SmallVectorImpl<SmallVector<OpAsmParser::UnresolvedOperand>> &`
*   Ref Directives
    -   A reference directive is passed to the parser using the same mapping
        as the input operand. For example, a single region would be passed as
        a `Region &`.
*   Region Variables
    -   Single: `Region &`
    -   Variadic: `SmallVectorImpl<std::unique_ptr<Region>> &`
*   Successor Variables
    -   Single: `Block *&`
    -   Variadic: `SmallVectorImpl<Block *> &`
*   Type Directives
    -   Single: `Type &`
    -   Optional: `Type &`
    -   Variadic: `SmallVectorImpl<Type> &`
    -   VariadicOfVariadic: `SmallVectorImpl<SmallVector<Type>> &`
*   `attr-dict` Directive: `NamedAttrList &`

When a variable is optional, the value should only be specified if the
variable is present. Otherwise, the value should remain `None` or null.

The arguments to the `print<UserDirective>` method are firstly a reference to
the `OpAsmPrinter` (`OpAsmPrinter &`), secondly the op (e.g. `FooOp op`, which
can alternatively be `Operation *op`), and finally a set of output parameters
corresponding to the parameters specified in the format. The mapping of
declarative parameter to `print` method argument is detailed below:

*   Attribute Variables
    -   Single: `<Attribute-Storage-Type>` (e.g. `Attribute`)
    -   Optional: `<Attribute-Storage-Type>` (e.g. `Attribute`)
*   Operand Variables
    -   Single: `Value`
    -   Optional: `Value`
    -   Variadic: `OperandRange`
    -   VariadicOfVariadic: `OperandRangeRange`
*   Ref Directives
    -   A reference directive is passed to the printer using the same mapping
        as the input operand. For example, a single region would be passed as
        a `Region &`.
*   Region Variables
    -   Single: `Region &`
    -   Variadic: `MutableArrayRef<Region>`
*   Successor Variables
    -   Single: `Block *`
    -   Variadic: `SuccessorRange`
*   Type Directives
    -   Single: `Type`
    -   Optional: `Type`
    -   Variadic: `TypeRange`
    -   VariadicOfVariadic: `TypeRangeRange`
*   `attr-dict` Directive: `DictionaryAttr`

When a variable is optional, the provided value may be null.

#### Optional Groups

In certain situations operations may have "optional" information, e.g.
attributes or an empty set of variadic operands. In these situations a section
of the assembly format can be marked as `optional` based on the presence of
this information.
An optional group is defined as follows:

```
optional-group: `(` elements `)` (`:` `(` else-elements `)`)? `?`
```

The `elements` of an optional group have the following requirements:

*   The first element of the group must either be an attribute, literal,
    operand, or region.
    -   This is because the first element must be optionally parsable.
*   Exactly one argument variable or type directive within the group must be
    marked as the anchor of the group.
    -   The anchor is the element whose presence controls whether the group
        should be printed/parsed.
    -   An element is marked as the anchor by adding a trailing `^`.
    -   The first element is *not* required to be the anchor of the group.
    -   When a non-variadic region anchors a group, the detector for printing
        the group is whether the region is empty.
*   Literals, variables, custom directives, and type directives are the only
    valid elements within the group.
    -   Any attribute variable may be used, but only optional attributes can
        be marked as the anchor.
    -   Only variadic or optional results and operand arguments can be used.
    -   All region variables can be used. When a non-variable-length region is
        used, if the group is not present the region is empty.

An example of an operation with an optional group is `func.return`, which has
a variadic number of operands.

```tablegen
def ReturnOp : ... {
  let arguments = (ins Variadic<AnyType>:$operands);

  // We only print the operands and types if there are a non-zero number
  // of operands.
  let assemblyFormat = "attr-dict ($operands^ `:` type($operands))?";
}
```

##### Unit Attributes

In MLIR, the [`unit` Attribute](Dialects/Builtin.md/#unitattr) is special in
that it only has one possible value, i.e. it derives meaning from its
existence.
When a unit attribute is used to anchor an optional group and is not the first
element of the group, the presence of the unit attribute can be directly
correlated with the presence of the optional group itself. As such, in these
situations the unit attribute will not be printed or present in the output and
will be automatically inferred when parsing by the presence of the optional
group itself.

For example, the following operation:

```tablegen
def FooOp : ... {
  let arguments = (ins UnitAttr:$is_read_only);

  let assemblyFormat = "attr-dict (`is_read_only` $is_read_only^)?";
}
```

would be formatted as such:

```mlir
// When the unit attribute is present:
foo.op is_read_only

// When the unit attribute is not present:
foo.op
```

##### Optional "else" Group

Optional groups also have support for an "else" group of elements. These are
elements that are parsed/printed if the `anchor` element of the optional group
is *not* present. Unlike the main element group, the "else" group has no
restriction on the first element, and none of the elements may act as the
`anchor` for the optional group. An example is shown below:

```tablegen
def FooOp : ... {
  let arguments = (ins UnitAttr:$foo);

  let assemblyFormat = "attr-dict (`foo_is_present` $foo^):(`foo_is_absent`)?";
}
```

would be formatted as such:

```mlir
// When the `foo` attribute is present:
foo.op foo_is_present

// When the `foo` attribute is not present:
foo.op foo_is_absent
```

#### Requirements

The format specification has a certain set of requirements that must be
adhered to:

1.  The output and operation name are never shown, as they are fixed and
    cannot be altered.
1.  All operands within the operation must appear within the format, either
    individually or with the `operands` directive.
1.  All regions within the operation must appear within the format, either
    individually or with the `regions` directive.
1.  All successors within the operation must appear within the format, either
    individually or with the `successors` directive.
1.  All operand and result types must appear within the format using the
    various `type` directives, either individually or with the `operands` or
    `results` directives.
1.  The `attr-dict` directive must always be present.
1.  The format must not contain overlapping information, e.g. multiple
    instances of 'attr-dict', types, operands, etc.
    -   Note that `attr-dict` does not overlap with individual attributes.
        These attributes will simply be elided when printing the attribute
        dictionary.

##### Type Inference

One requirement of the format is that the types of operands and results must
always be present. In certain instances, the type of a variable may be deduced
via type constraints or other information available. In these cases, the type
of that variable may be elided from the format.

*   Buildable Types

Some type constraints may only have one representation, allowing for them to
be directly buildable; for example the `I32` or `Index` types. Types in `ODS`
may mark themselves as buildable by setting the `builderCall` field or
inheriting from the `BuildableType` class.

*   Trait Equality Constraints

There are many operations that have known type equality constraints registered
as traits on the operation; for example the true, false, and result values of
a `select` operation often have the same type. The assembly format may inspect
these equal constraints to discern the types of missing variables. The
currently supported traits are: `AllTypesMatch`, `TypesMatchWith`,
`SameTypeOperands`, and `SameOperandsAndResultType`.

*   InferTypeOpInterface

Operations that implement `InferTypeOpInterface` can omit their result types
in their assembly format since the result types can be inferred from the
operands.
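
As a sketch of type elision via a trait, a hypothetical op using
`SameOperandsAndResultType` might spell out only one type in its format (the
dialect and op names here are invented for illustration):

```tablegen
def MyDialect_AbsOp : MyDialect_Op<"abs", [SameOperandsAndResultType]> {
  let arguments = (ins F32Tensor:$input);
  let results = (outs F32Tensor:$result);

  // Only the operand type is printed; the result type can be recovered
  // from the SameOperandsAndResultType trait during parsing.
  let assemblyFormat = "$input attr-dict `:` type($input)";
}
```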

### `hasCanonicalizer`

This boolean field indicates whether canonicalization patterns have been
defined for this operation. If it is `1`, then
`::getCanonicalizationPatterns()` should be defined.

### `hasCanonicalizeMethod`

When this boolean field is set to `true`, it indicates that the op implements
a `canonicalize` method for simple "matchAndRewrite" style canonicalization
patterns. If `hasCanonicalizer` is 0, then an implementation of
`::getCanonicalizationPatterns()` is implemented to call this function.

### `hasFolder`

This boolean field indicates whether general folding rules have been defined
for this operation. If it is `1`, then `::fold()` should be defined.

### Extra declarations

One of the goals of table-driven op definition is to auto-generate as much of
the logic and methods needed for each op as possible. With that said, there
will always be long-tail cases that won't be covered. For such cases, you can
use `extraClassDeclaration`. Code in `extraClassDeclaration` will be copied
literally to the generated C++ op class.

Note that `extraClassDeclaration` is a mechanism intended for long-tail cases
by power users; for not-yet-implemented widely-applicable cases, improving the
infrastructure is preferable.

### Extra definitions

When defining base op classes in TableGen that are inherited many times by
different ops, users may want to provide common definitions of utility and
interface functions. However, many of these definitions may not be desirable
or possible in `extraClassDeclaration`, which appends them to the op's C++
class declaration. In these cases, users can add an `extraClassDefinition` to
define code that is added to the generated source file inside the op's C++
namespace. The substitution `$cppClass` is replaced by the op's C++ class name.
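
A minimal hypothetical sketch of both mechanisms together (the dialect, op,
and helper method names are invented for illustration):

```tablegen
def MyDialect_BaseOp : MyDialect_Op<"base", []> {
  // Copied verbatim into the generated C++ class declaration.
  let extraClassDeclaration = [{
    // Hypothetical helper; its body is emitted from extraClassDefinition.
    bool hasNoOperands();
  }];

  // Emitted into the generated source file inside the op's namespace;
  // $cppClass expands to the op's C++ class name (here, BaseOp).
  let extraClassDefinition = [{
    bool $cppClass::hasNoOperands() {
      return getOperation()->getNumOperands() == 0;
    }
  }];
}
```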
+
+ ### Generated C++ code
+
+ [OpDefinitionsGen][OpDefinitionsGen] processes the op definition spec file and
+ generates two files containing the corresponding C++ code: one for
+ declarations, the other for definitions. The former is generated via the
+ `-gen-op-decls` command-line option, while the latter is via the
+ `-gen-op-defs` option.
+
+ The definition file contains all the op method definitions, which can be
+ included and enabled by defining `GET_OP_CLASSES`. For each operation,
+ OpDefinitionsGen generates an operation class and an
+ [operand adaptor](#operand-adaptors) class. It also contains a comma-separated
+ list of all defined ops, which can be included and enabled by defining
+ `GET_OP_LIST`.
+
+ #### Class name and namespaces
+
+ For each operation, its generated C++ class name is the symbol `def`ed in
+ TableGen with the dialect prefix removed. The first `_` serves as the
+ delimiter. For example, for `def TF_AddOp`, the C++ class name would be
+ `AddOp`. We remove the `TF` prefix because it is for scoping ops; other
+ dialects may define their own `AddOp`s as well.
+
+ The namespaces of the generated C++ class will come from the dialect's
+ `cppNamespace` field. For example, if a dialect's `cppNamespace` is `A::B`,
+ then an op of that dialect will be placed in
+ `namespace A { namespace B { ... } }`. If a dialect does not specify a
+ `cppNamespace`, we then use the dialect's name as the namespace.
+
+ This means the qualified name of the generated C++ class does not necessarily
+ match exactly with the operation name as explained in
+ [Operation name](#operation-name). This allows flexible naming to satisfy
+ coding style requirements.
+
+ #### Operand adaptors
+
+ For each operation, we automatically generate an _operand adaptor_. This class
+ solves the problem of accessing operands provided as a list of `Value`s
+ without using "magic" constants.
The operand adaptor takes a reference to an array of
+ `Value` and provides methods with the same names as those in the operation
+ class to access them. For example, for a binary arithmetic operation, it may
+ provide `.lhs()` to access the first operand and `.rhs()` to access the
+ second operand.
+
+ The operand adaptor class lives in the same namespace as the operation class,
+ and has the name of the operation followed by `Adaptor`, as well as an alias
+ `Adaptor` inside the op class.
+
+ Operand adaptors can be used in function templates that also process
+ operations:
+
+ ```c++
+ template <typename BinaryOpTy>
+ std::pair<Value, Value> zip(BinaryOpTy &&op) {
+   return std::make_pair(op.lhs(), op.rhs());
+ }
+
+ void process(AddOp op, ArrayRef<Value> newOperands) {
+   zip(op);
+   zip(Adaptor(newOperands));
+   /*...*/
+ }
+ ```
+
+ ## Constraints
+
+ Constraint is a core concept in table-driven operation definition: operation
+ verification and graph operation matching are all based on satisfying
+ constraints. So both the operation definition and rewrite rule specification
+ significantly involve writing constraints. We have the `Constraint` class in
+ [`OpBase.td`][OpBase] as the common base class for all constraints.
+
+ An operation's constraints can cover different ranges; they may
+
+ * Only concern a single attribute (e.g. being a 32-bit integer greater than
+   5),
+ * Involve multiple operands and results (e.g., the 1st result's shape must be
+   the same as the 1st operand's), or
+ * Be intrinsic to the operation itself (e.g., having no side effect).
+
+ We call them single-entity constraints, multi-entity constraints, and traits,
+ respectively.
+
+ ### Single-entity constraint
+
+ Constraints scoped to a single operand, attribute, or result are specified at
+ the entity's declaration place as described in
+ [Operation arguments](#operation-arguments) and
+ [Operation results](#operation-results).
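+
+ A small sketch (the op name is hypothetical) of constraints attached directly
+ at the declaration place:
+
+ ```tablegen
+ def MyDialect_ScaleOp : Op<MyDialect, "scale"> {
+   let arguments = (ins
+     TensorOf<[F32]>:$input,   // operand constrained to be a f32 tensor
+     F32Attr:$factor           // attribute constrained to be a f32 attribute
+   );
+   let results = (outs TensorOf<[F32]>:$output);
+ }
+ ```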
+
+ To help model constraints of common types, a set of `TypeConstraint`s are
+ created; they are the `Type` subclass hierarchy. It includes `F32` for the
+ constraint of being a float, `TensorOf<[F32]>` for the constraint of being a
+ float tensor, and so on.
+
+ Similarly, a set of `AttrConstraint`s are created to help model constraints
+ of common attribute kinds. They are the `Attr` subclass hierarchy. It includes
+ `F32Attr` for the constraint of being a float attribute, `F32ArrayAttr` for
+ the constraint of being a float array attribute, and so on.
+
+ ### Multi-entity constraint
+
+ Constraints involving more than one operand/attribute/result are quite common
+ on operations, like the element type and shape relation between operands and
+ results. These constraints should be specified as the `Op` class template
+ parameter as described in
+ [Operation traits and constraints](#operation-traits-and-constraints).
+
+ Multi-entity constraints are modeled as `PredOpTrait` (a subclass of
+ `OpTrait`) in [`OpBase.td`][OpBase]. A bunch of constraint primitives are
+ provided to aid specification. See [`OpBase.td`][OpBase] for the complete
+ list.
+
+ ### Trait
+
+ Traits are intrinsic properties of the operation, like whether it has side
+ effects, whether it is commutative, whether it is a terminator, etc. These
+ constraints should be specified as the `Op` class template parameter as
+ described in
+ [Operation traits and constraints](#operation-traits-and-constraints).
+
+ Traits are modeled as `NativeOpTrait` (a subclass of `OpTrait`) in
+ [`OpBase.td`][OpBase]. They are backed by, and will be translated into, the
+ corresponding C++ `mlir::OpTrait` classes.
+
+ ### How to specify a new constraint
+
+ To write a constraint, you need to provide its predicates and give it a
+ descriptive name. Predicates, modeled with the `Pred` class, are the workhorse
+ for composing constraints.
The predicate for a constraint is typically built up
+ in a nested manner, using the two categories of predicates:
+
+ 1. `CPred`: the primitive leaf predicate.
+ 2. Compound predicate: a predicate composed from child predicates using
+    predicate combiners (conjunction: `And`, disjunction: `Or`, negation:
+    `Neg`, substitution: `SubstLeaves`, concatenation: `Concat`).
+
+ `CPred` is the basis for composing more complex predicates. It is the "atom"
+ predicate from the perspective of TableGen and the "interface" between
+ TableGen and C++. What is inside is already C++ code, which will be treated
+ as an opaque string with special placeholders to be substituted.
+
+ You can put any C++ code that returns a boolean value inside a `CPred`,
+ including evaluating expressions, calling functions, calling class methods,
+ and so on.
+
+ To help interaction with the C++ environment, there are a few special
+ placeholders provided to refer to entities in the context where this
+ predicate is used. They serve as "hooks" to the enclosing environment. This
+ includes `$_builder`, `$_op`, and `$_self`:
+
+ * `$_builder` will be replaced by a `mlir::Builder` instance so that you can
+   access common build methods.
+ * `$_op` will be replaced by the current operation so that you can access
+   information of the current operation.
+ * `$_self` will be replaced with the entity this predicate is attached to.
+   E.g., `BoolAttr` is an attribute constraint that wraps a
+   `CPred<"$_self.isa<BoolAttr>()">`. Then for `BoolAttr:$attr`, `$_self` will
+   be replaced by `$attr`. Type constraints are a little bit special: because
+   we want the constraints on each type definition to read naturally, and
+   because we want to attach type constraints directly to an operand/result,
+   `$_self` will be replaced by the operand/result's type. E.g., for `F32` in
+   `F32:$operand`, its `$_self` will be expanded as `operand(...).getType()`.
+
+ TODO: Reconsider the leading symbol for special placeholders.
Eventually we want
+ to allow referencing operand/result `$-name`s; such `$-name`s can start with
+ underscore.
+
+ For example, to write that an attribute `attr` is an `IntegerAttr`, in C++
+ you can just call `attr.isa<IntegerAttr>()`. The code can be wrapped in a
+ `CPred` as `$_self.isa<IntegerAttr>()`, with `$_self` as the special
+ placeholder to be replaced by the current attribute `attr` at expansion time.
+
+ For more complicated predicates, you can wrap them in a single `CPred`, or
+ you can use predicate combiners to combine them. For example, to write the
+ constraint that an attribute `attr` is a 32-bit or 64-bit integer, you can
+ write it as
+
+ ```tablegen
+ And<[
+   CPred<"$_self.isa<IntegerAttr>()">,
+   Or<[
+     CPred<"$_self.cast<IntegerAttr>().getType().isInteger(32)">,
+     CPred<"$_self.cast<IntegerAttr>().getType().isInteger(64)">
+   ]>
+ ]>
+ ```
+
+ (Note that the above is just to show with a familiar example how you can use
+ `CPred` and predicate combiners to write complicated predicates. For integer
+ attributes specifically, [`OpBase.td`][OpBase] already defines `I32Attr` and
+ `I64Attr`. So you can actually reuse them to write it as
+ `Or<[I32Attr.predicate, I64Attr.predicate]>`.)
+
+ TODO: Build up a library of reusable primitive constraints
+
+ If the predicate is very complex to write with `CPred` together with
+ predicate combiners, you can also write it as a normal C++ function and use
+ the `CPred` as a way to "invoke" the function. For example, to verify an
+ attribute `attr` has some property, you can write a C++ function like
+
+ ```cpp
+ bool HasSomeProperty(Attribute attr) { ... }
+ ```
+
+ and then define the op as:
+
+ ```tablegen
+ def HasSomeProperty : AttrConstraint<CPred<"HasSomeProperty($_self)">,
+                                      "has some property">;
+
+ def MyOp : Op<...> {
+   let arguments = (ins
+     ...
+
+     HasSomeProperty:$attr
+   );
+ }
+ ```
+
+ As to whether we should define the predicate using a single `CPred` wrapping
+ the whole expression, multiple `CPred`s with predicate combiners, or a single
+ `CPred` "invoking" a function, there are no clear-cut criteria. Defining
+ using `CPred` and predicate combiners is preferable since it exposes more
+ information (instead of hiding all the logic behind a C++ function) in the op
+ definition spec, so that it can potentially drive more auto-generation cases.
+ But it will require a nice library of common predicates as the building
+ blocks to avoid duplication, which is being worked on right now.
+
+ ## Attribute Definition
+
+ An attribute is a compile-time known constant of an operation.
+
+ ODS provides attribute wrappers over C++ attribute classes. There are a few
+ common C++ [attribute classes][AttrClasses] defined in MLIR's core IR library
+ and one is free to define dialect-specific attribute classes. ODS allows one
+ to use these attributes in TableGen to define operations, potentially with
+ more fine-grained constraints. For example, `StrAttr` directly maps to
+ `StringAttr`; `F32Attr`/`F64Attr` requires the `FloatAttr` to additionally be
+ of a certain bitwidth.
+
+ ODS attributes are defined as having a storage type (corresponding to a
+ backing `mlir::Attribute` that _stores_ the attribute), a return type
+ (corresponding to the C++ _return_ type of the generated helper getters), as
+ well as a method to convert between the internal storage and the helper
+ method return type.
+
+ ### Attribute decorators
+
+ There are a few important attribute adapters/decorators/modifiers that can be
+ applied to ODS attributes to specify common additional properties like
+ optionality, default values, etc.:
+
+ * `DefaultValuedAttr`: specifies the
+   [default value](#attributes-with-default-values) for an attribute.
+ * `OptionalAttr`: specifies an attribute as [optional](#optional-attributes).
+
+ * `Confined`: adapts an attribute with
+   [further constraints](#confining-attributes).
+
+ ### Enum attributes
+
+ Some attributes can only take values from a predefined enum, e.g., the
+ comparison kind of a comparison op. To define such attributes, ODS provides
+ several mechanisms: `StrEnumAttr`, `IntEnumAttr`, and `BitEnumAttr`.
+
+ * `StrEnumAttr`: each enum case is a string; the attribute is stored as a
+   [`StringAttr`][StringAttr] in the op.
+ * `IntEnumAttr`: each enum case is an integer; the attribute is stored as an
+   [`IntegerAttr`][IntegerAttr] in the op.
+ * `BitEnumAttr`: each enum case is either the empty case, a single bit, or a
+   group of single bits; the attribute is stored as an
+   [`IntegerAttr`][IntegerAttr] in the op.
+
+ All these `*EnumAttr` attributes require fully specifying all of the allowed
+ cases via their corresponding `*EnumAttrCase`. With this, ODS is able to
+ generate additional verification to only accept allowed cases. To facilitate
+ the interaction between `*EnumAttr`s and their C++ consumers, the
+ [`EnumsGen`][EnumsGen] TableGen backend can generate a few common utilities:
+ a C++ enum class, `llvm::DenseMapInfo` for the enum class, and conversion
+ functions from/to strings. This is controlled via the `-gen-enum-decls` and
+ `-gen-enum-defs` command-line options of `mlir-tblgen`.
+
+ For example, given the following `EnumAttr`:
+
+ ```tablegen
+ def Case15: I32EnumAttrCase<"Case15", 15>;
+ def Case20: I32EnumAttrCase<"Case20", 20>;
+
+ def MyIntEnum: I32EnumAttr<"MyIntEnum", "An example int enum",
+                            [Case15, Case20]> {
+   let cppNamespace = "Outer::Inner";
+   let stringToSymbolFnName = "ConvertToEnum";
+   let symbolToStringFnName = "ConvertToString";
+ }
+ ```
+
+ The following will be generated via `mlir-tblgen -gen-enum-decls`:
+
+ ```c++
+ namespace Outer {
+ namespace Inner {
+ // An example int enum
+ enum class MyIntEnum : uint32_t {
+   Case15 = 15,
+   Case20 = 20,
+ };
+
+ llvm::Optional<MyIntEnum> symbolizeMyIntEnum(uint32_t);
+ llvm::StringRef ConvertToString(MyIntEnum);
+ llvm::Optional<MyIntEnum> ConvertToEnum(llvm::StringRef);
+ inline constexpr unsigned getMaxEnumValForMyIntEnum() {
+   return 20;
+ }
+
+ } // namespace Inner
+ } // namespace Outer
+
+ namespace llvm {
+ template<> struct DenseMapInfo<Outer::Inner::MyIntEnum> {
+   using StorageInfo = llvm::DenseMapInfo<uint32_t>;
+
+   static inline Outer::Inner::MyIntEnum getEmptyKey() {
+     return static_cast<Outer::Inner::MyIntEnum>(StorageInfo::getEmptyKey());
+   }
+
+   static inline Outer::Inner::MyIntEnum getTombstoneKey() {
+     return static_cast<Outer::Inner::MyIntEnum>(StorageInfo::getTombstoneKey());
+   }
+
+   static unsigned getHashValue(const Outer::Inner::MyIntEnum &val) {
+     return StorageInfo::getHashValue(static_cast<uint32_t>(val));
+   }
+
+   static bool isEqual(const Outer::Inner::MyIntEnum &lhs,
+                       const Outer::Inner::MyIntEnum &rhs) {
+     return lhs == rhs;
+   }
+ };
+ } // namespace llvm
+ ```
+
+ The following will be generated via `mlir-tblgen -gen-enum-defs`:
+
+ ```c++
+ namespace Outer {
+ namespace Inner {
+ llvm::StringRef ConvertToString(MyIntEnum val) {
+   switch (val) {
+     case MyIntEnum::Case15: return "Case15";
+     case MyIntEnum::Case20: return "Case20";
+   }
+   return "";
+ }
+
+ llvm::Optional<MyIntEnum> ConvertToEnum(llvm::StringRef str) {
+   return llvm::StringSwitch<llvm::Optional<MyIntEnum>>(str)
+       .Case("Case15", MyIntEnum::Case15)
+       .Case("Case20", MyIntEnum::Case20)
+       .Default(llvm::None);
+ }
+ llvm::Optional<MyIntEnum> symbolizeMyIntEnum(uint32_t
value) {
+   switch (value) {
+     case 15: return MyIntEnum::Case15;
+     case 20: return MyIntEnum::Case20;
+     default: return llvm::None;
+   }
+ }
+
+ } // namespace Inner
+ } // namespace Outer
+ ```
+
+ Similarly for the following `BitEnumAttr` definition:
+
+ ```tablegen
+ def None: BitEnumAttrCaseNone<"None">;
+ def Bit0: BitEnumAttrCaseBit<"Bit0", 0>;
+ def Bit1: BitEnumAttrCaseBit<"Bit1", 1>;
+ def Bit2: BitEnumAttrCaseBit<"Bit2", 2>;
+ def Bit3: BitEnumAttrCaseBit<"Bit3", 3>;
+
+ def MyBitEnum: BitEnumAttr<"MyBitEnum", "An example bit enum",
+                            [None, Bit0, Bit1, Bit2, Bit3]>;
+ ```
+
+ We can have:
+
+ ```c++
+ // An example bit enum
+ enum class MyBitEnum : uint32_t {
+   None = 0,
+   Bit0 = 1,
+   Bit1 = 2,
+   Bit2 = 4,
+   Bit3 = 8,
+ };
+
+ llvm::Optional<MyBitEnum> symbolizeMyBitEnum(uint32_t);
+ std::string stringifyMyBitEnum(MyBitEnum);
+ llvm::Optional<MyBitEnum> symbolizeMyBitEnum(llvm::StringRef);
+ inline MyBitEnum operator|(MyBitEnum lhs, MyBitEnum rhs) {
+   return static_cast<MyBitEnum>(static_cast<uint32_t>(lhs) |
+                                 static_cast<uint32_t>(rhs));
+ }
+ inline MyBitEnum operator&(MyBitEnum lhs, MyBitEnum rhs) {
+   return static_cast<MyBitEnum>(static_cast<uint32_t>(lhs) &
+                                 static_cast<uint32_t>(rhs));
+ }
+ inline bool bitEnumContains(MyBitEnum bits, MyBitEnum bit) {
+   return (static_cast<uint32_t>(bits) & static_cast<uint32_t>(bit)) != 0;
+ }
+
+ namespace llvm {
+ template<> struct DenseMapInfo<::MyBitEnum> {
+   using StorageInfo = llvm::DenseMapInfo<uint32_t>;
+
+   static inline ::MyBitEnum getEmptyKey() {
+     return static_cast<::MyBitEnum>(StorageInfo::getEmptyKey());
+   }
+
+   static inline ::MyBitEnum getTombstoneKey() {
+     return static_cast<::MyBitEnum>(StorageInfo::getTombstoneKey());
+   }
+
+   static unsigned getHashValue(const ::MyBitEnum &val) {
+     return StorageInfo::getHashValue(static_cast<uint32_t>(val));
+   }
+
+   static bool isEqual(const ::MyBitEnum &lhs, const ::MyBitEnum &rhs) {
+     return lhs == rhs;
+   }
+ };
+ } // namespace llvm
+ ```
+
+ ```c++
+ std::string stringifyMyBitEnum(MyBitEnum symbol) {
+   auto val = static_cast<uint32_t>(symbol);
+   assert(15u == (15u | val) && "invalid bits set in bit enum");
+   // Special case for all bits unset.
+   if (val == 0) return "None";
+   llvm::SmallVector<llvm::StringRef, 2> strs;
+   if (1u == (1u & val)) { strs.push_back("Bit0"); }
+   if (2u == (2u & val)) { strs.push_back("Bit1"); }
+   if (4u == (4u & val)) { strs.push_back("Bit2"); }
+   if (8u == (8u & val)) { strs.push_back("Bit3"); }
+
+   return llvm::join(strs, "|");
+ }
+
+ llvm::Optional<MyBitEnum> symbolizeMyBitEnum(llvm::StringRef str) {
+   // Special case for all bits unset.
+   if (str == "None") return MyBitEnum::None;
+
+   llvm::SmallVector<llvm::StringRef, 2> symbols;
+   str.split(symbols, "|");
+
+   uint32_t val = 0;
+   for (auto symbol : symbols) {
+     auto bit = llvm::StringSwitch<llvm::Optional<uint32_t>>(symbol)
+                    .Case("Bit0", 1)
+                    .Case("Bit1", 2)
+                    .Case("Bit2", 4)
+                    .Case("Bit3", 8)
+                    .Default(llvm::None);
+     if (bit) { val |= *bit; } else { return llvm::None; }
+   }
+   return static_cast<MyBitEnum>(val);
+ }
+
+ llvm::Optional<MyBitEnum> symbolizeMyBitEnum(uint32_t value) {
+   // Special case for all bits unset.
+   if (value == 0) return MyBitEnum::None;
+
+   if (value & ~(1u | 2u | 4u | 8u)) return llvm::None;
+   return static_cast<MyBitEnum>(value);
+ }
+ ```
+
+ ## Debugging Tips
+
+ ### Run `mlir-tblgen` to see the generated content
+
+ TableGen syntax can sometimes be obscure; reading the generated content can
+ be a very helpful way to understand and debug issues. To build `mlir-tblgen`,
+ run `cmake --build . --target mlir-tblgen` in your build directory and find
+ the `mlir-tblgen` binary in the `bin/` subdirectory. All the supported
+ generators can be found via `mlir-tblgen --help`. For example,
+ `--gen-op-decls` and `--gen-op-defs`, as explained in
+ [Generated C++ code](#generated-c-code).
+
+ To see the generated code, invoke `mlir-tblgen` with a specific generator by
+ providing include paths via `-I`.
For example,
+
+ ```sh
+ # To see op C++ class declaration
+ mlir-tblgen --gen-op-decls -I /path/to/mlir/include /path/to/input/td/file
+ # To see op C++ class definition
+ mlir-tblgen --gen-op-defs -I /path/to/mlir/include /path/to/input/td/file
+ # To see op documentation
+ mlir-tblgen --gen-dialect-doc -I /path/to/mlir/include /path/to/input/td/file
+
+ # To see op interface C++ class declaration
+ mlir-tblgen --gen-op-interface-decls -I /path/to/mlir/include /path/to/input/td/file
+ # To see op interface C++ class definition
+ mlir-tblgen --gen-op-interface-defs -I /path/to/mlir/include /path/to/input/td/file
+ # To see op interface documentation
+ mlir-tblgen --gen-op-interface-doc -I /path/to/mlir/include /path/to/input/td/file
+ ```
+
+ ## Appendix
+
+ ### Reporting deprecation
+
+ Classes/defs can be marked as deprecated by using the `Deprecated` helper
+ class, e.g.,
+
+ ```td
+ def OpTraitA : NativeOpTrait<"OpTraitA">, Deprecated<"use `bar` instead">;
+ ```
+
+ would result in marking `OpTraitA` as deprecated, and `mlir-tblgen` can emit
+ a warning (the default) or an error (depending on the `-on-deprecated` flag)
+ to make the deprecated state known.
+
+ ### Requirements and existing mechanisms analysis
+
+ The op description should be as declarative as possible to allow a wide range
+ of tools to work with them and query methods generated from them. In
+ particular this means specifying traits, constraints and shape inference
+ information in a way that is easily analyzable (e.g., avoid opaque calls to
+ C++ functions where possible).
+
+ We considered the approaches of several contemporary systems and focused on
+ requirements that were desirable:
+
+ * Ops registered using a registry separate from C++ code.
+   * Unknown ops are allowed in MLIR, so ops need not be registered. The
+     ability of the compiler to optimize those ops or graphs containing those
+     ops is constrained but correct.
+
+   * The current proposal does not include a runtime op description, but it
+     does not preclude such a description; it can be added later.
+   * The op registry is essential for generating C++ classes that make
+     manipulating ops, verifying correct construction etc. in C++ easier, by
+     providing a typed representation and accessors.
+ * The op registry will be defined in
+   [TableGen](https://llvm.org/docs/TableGen/index.html) and be used to
+   generate C++ classes and utility functions
+   (builder/verifier/parser/printer).
+   * TableGen is a modelling specification language used by LLVM's backends
+     and fits in well with trait-based modelling. This is an implementation
+     decision and there are alternative ways of doing this. But the
+     specification language is good for the requirements of modelling the
+     traits (as seen from usage in LLVM processor backend modelling) and easy
+     to extend, so it is a practical choice. If another good option comes up,
+     we will consider it.
+ * MLIR allows both defined and undefined ops.
+   * Defined ops should have fixed semantics and could have a corresponding
+     reference implementation defined.
+   * Dialects are under full control of the dialect owner and normally live
+     with the framework of the dialect.
+ * The op's traits (e.g., commutative) are modelled along with the op in the
+   registry.
+ * The op's operand/return type constraints are modelled along with the op in
+   the registry (see [Shape inference](ShapeInference.md) discussion below);
+   this allows (e.g.) optimized concise syntax in textual dumps.
+ * Behavior of the op is documented along with the op with a summary and a
+   description. The description is written in markdown and extracted for
+   inclusion in the generated LangRef section of the dialect.
+
+ * The generic assembly form of printing and parsing is available as normal,
+   but a custom parser and printer can either be specified or automatically
+   generated from an optional string representation showing the mapping of the
+   "assembly" string to operands/type.
+   * Parser-level remappings (e.g., `eq` to enum) will be supported as part
+     of the parser generation.
+ * Matching patterns are specified separately from the op description.
+   * Contrasted with LLVM, there is no "base" set of ops that every backend
+     needs to be aware of. Instead there are many different dialects and the
+     transformations/legalizations between these dialects form a graph of
+     transformations.
+ * A reference implementation may be provided along with the op definition.
+   * The reference implementation may be in terms of either standard ops or
+     other reference implementations.
+
+ TODO: document expectation if the dependent op's definition changes.
+
+ [TableGen]: https://llvm.org/docs/TableGen/index.html
+ [TableGenProgRef]: https://llvm.org/docs/TableGen/ProgRef.html
+ [TableGenBackend]: https://llvm.org/docs/TableGen/BackEnds.html#introduction
+ [OpBase]: https://github.com/llvm/llvm-project/blob/main/mlir/include/mlir/IR/OpBase.td
+ [OpDefinitionsGen]: https://github.com/llvm/llvm-project/blob/main/mlir/tools/mlir-tblgen/OpDefinitionsGen.cpp
+ [EnumsGen]: https://github.com/llvm/llvm-project/blob/main/mlir/tools/mlir-tblgen/EnumsGen.cpp
+ [StringAttr]: Dialects/Builtin.md/#stringattr
+ [IntegerAttr]: Dialects/Builtin.md/#integertype
+ [AttrClasses]: https://github.com/llvm/llvm-project/blob/main/mlir/include/mlir/IR/Attributes.h
diff --git a/mlir/docs/PDLL.md b/mlir/docs/PDLL.md
--- a/mlir/docs/PDLL.md
+++ b/mlir/docs/PDLL.md
@@ -33,7 +33,7 @@
 ### Why build a new language instead of improving TableGen DRR?
-Note: The section assumes familiarity with +Note: This section assumes familiarity with [TDRR](https://mlir.llvm.org/docs/DeclarativeRewrites/), please refer the relevant documentation before continuing. diff --git a/mlir/docs/PDLL.md.rej b/mlir/docs/PDLL.md.rej new file mode 100644 --- /dev/null +++ b/mlir/docs/PDLL.md.rej @@ -0,0 +1,1402 @@ +diff a/mlir/docs/PDLL.md b/mlir/docs/PDLL.md (rejected hunks) +@@ -1,1399 +1,1399 @@ + # PDLL - PDL Language + + This document details the PDL Language (PDLL), a custom frontend language for + writing pattern rewrites targeting MLIR. + + Note: This document assumes a familiarity with MLIR concepts; more specifically + the concepts detailed within the + [MLIR Pattern Rewriting](https://mlir.llvm.org/docs/PatternRewriter/) and + [Operation Definition Specification (ODS)](https://mlir.llvm.org/docs/OpDefinitions/) + documentation. + + [TOC] + + ## Introduction + + Pattern matching is an extremely important component within MLIR, as it + encompasses many different facets of the compiler. From canonicalization, to + optimization, to conversion; every MLIR based compiler will heavily rely on the + pattern matching infrastructure in some capacity. + + The PDL Language (PDLL) provides a declarative pattern language designed from + the ground up for representing MLIR pattern rewrites. PDLL is designed to + natively support writing matchers on all of MLIRs constructs via an intuitive + interface that may be used for both ahead-of-time (AOT) and just-in-time (JIT) + pattern compilation. + + ## Rationale + + This section provides details on various design decisions, their rationale, and + alternatives considered when designing PDLL. Given the nature of software + development, this section may include references to areas of the MLIR compiler + that no longer exist. + + ### Why build a new language instead of improving TableGen DRR? 
+
+-Note: The section assumes familiarity with
++Note: This section assumes familiarity with
+ [TDRR](https://mlir.llvm.org/docs/DeclarativeRewrites/), please refer to the
+ relevant documentation before continuing.
+
+ Tablegen DRR (TDRR), i.e.
+ [Table-driven Declarative Rewrite Rules](https://mlir.llvm.org/docs/DeclarativeRewrites/),
+ is a declarative DSL for defining MLIR pattern rewrites within the
+ [TableGen](https://llvm.org/docs/TableGen/index.html) language. This
+ infrastructure is currently the main way in which patterns may be defined
+ declaratively within MLIR. TDRR utilizes TableGen's `dag` support to enable
+ defining MLIR patterns that fit nicely within a DAG structure, in a similar
+ way to how TableGen has been used to define patterns for LLVM's backend
+ infrastructure (SelectionDAG/GlobalISel/etc.). Unfortunately, however, the
+ TableGen language is not as amenable to the structure of MLIR patterns as it
+ has been for LLVM.
+
+ The issues with TDRR largely stem from the use of TableGen as the host
+ language for the DSL. These issues have arisen from a mismatch between the
+ structure of TableGen and the structure of MLIR, and from TableGen having
+ different motivational goals than MLIR. A majority (or all, depending on how
+ stubborn you are) of the issues that we've come across with TDRR have been
+ addressable in some form; the sticking point here is that the solutions to
+ these problems have often been more "creative" than we'd like. This is a
+ problem, and why we decided not to invest a larger effort into improving
+ TDRR; users generally don't want "creative" APIs, they want something that is
+ intuitive to read/write.
+
+ To highlight some of these issues, below we will take a tour through some of
+ the problems that have arisen, and how we "fixed" them.
+
+ #### Multi-result operations
+
+ MLIR natively supports a variable number of operation results.
For the DAG-based
+ structure of TDRR, any form of multiple results (operations in this instance)
+ creates a problem. This is because the DAG wants a single root node, and does
+ not have nice facilities for indexing or naming the multiple results. Let's
+ take a look at a quick example to see how this manifests:
+
+ ```tablegen
+ // Suppose we have a three result operation, defined as seen below.
+ def ThreeResultOp : Op<"three_result_op"> {
+   let arguments = (ins ...);
+
+   let results = (outs
+     AnyTensor:$output1,
+     AnyTensor:$output2,
+     AnyTensor:$output3
+   );
+ }
+
+ // To bind the results of `ThreeResultOp` in a TDRR pattern, we bind all
+ // results to a single name and use a special naming convention: `__N`, where
+ // `N` is the N-th result.
+ def : Pattern<(ThreeResultOp:$results ...),
+               [(... $results__0), ..., (... $results__2), ...]>;
+ ```
+
+ In TDRR, we "solved" the problem of accessing multiple results, but this
+ isn't a very intuitive interface for users. Magical naming conventions
+ obfuscate the code and can easily introduce bugs and other errors. There are
+ various things that we could try to improve this situation, but there is a
+ fundamental limit to what we can do given the limits of the TableGen dag
+ structure. In PDLL, however, we have the freedom and flexibility to provide a
+ proper interface into operations, regardless of their structure:
+
+ ```pdll
+ // Import our definition of `ThreeResultOp`.
+ #include "ops.td"
+
+ Pattern {
+   ...
+
+   // In PDLL, we can directly reference the results of an operation variable.
+   // This provides a closer mental model to what the user expects.
+   let threeResultOp = op<my_dialect.three_result_op>;
+   let userOp = op<my_dialect.user_op>(threeResultOp.output1, ...,
+                                       threeResultOp.output3);
+
+   ...
+ }
+ ```
+
+ #### Constraints
+
+ In TDRR, the match dag defines the general structure of the input IR to
+ match. Any non-structural/non-type constraints on the input are generally
+ relegated to a list of constraints specified after the rewrite dag.
For very simple patterns
+ this may suffice, but with larger patterns it becomes quite problematic, as
+ it separates the constraint from the entity it constrains and negatively
+ impacts the readability of the pattern. As an example, let's look at a simple
+ pattern that adds additional constraints to its inputs:
+
+ ```tablegen
+ // Suppose we have a two result operation, defined as seen below.
+ def TwoResultOp : Op<"two_result_op"> {
+   let arguments = (ins ...);
+
+   let results = (outs
+     AnyTensor:$output1,
+     AnyTensor:$output2
+   );
+ }
+
+ // A simple constraint to check if a value is use_empty.
+ def HasNoUseOf: Constraint<CPred<"$_self.use_empty()">, "has no use">;
+
+ // Check if two values have a ShapedType with the same element type.
+ def HasSameElementType : Constraint<
+     CPred<"$0.getType().cast<ShapedType>().getElementType() == "
+           "$1.getType().cast<ShapedType>().getElementType()">,
+     "values have same element type">;
+
+ def : Pattern<(TwoResultOp:$results $input),
+               [(...), (...)],
+               [(HasNoUseOf:$results__1),
+                (HasSameElementType $results__0, $input)]>;
+ ```
+
+ Above, when observing the constraints we need to search through the input dag
+ for the inputs (also keeping in mind the magic naming convention for multiple
+ results). For this simple pattern it may be just a few lines above, but
+ complex patterns often grow to tens of lines long. In PDLL, these constraints
+ can be applied directly on or next to the entities they apply to:
+
+ ```pdll
+ // The same constraints that we defined above:
+ Constraint HasNoUseOf(value: Value) [{
+   return success(value.use_empty());
+ }];
+ Constraint HasSameElementType(value1: Value, value2: Value) [{
+   return success(value1.getType().cast<ShapedType>().getElementType() ==
+                  value2.getType().cast<ShapedType>().getElementType());
+ }];
+
+ Pattern {
+   // In PDLL, we can apply the constraint as early (or as late) as we want.
+   // This enables better structuring of the matcher code, and improves the
+   // readability/maintainability of the pattern.
+
+   let op = op<my_dialect.two_result_op>(input: Value);
+   HasNoUseOf(op.output2);
+   HasSameElementType(input, op.output2);
+
+   // ...
+ }
+ ```
+
+ #### Replacing Multiple Operations
+
+ Oftentimes a pattern will transform N input operations into N result
+ operations. In PDLL, replacing multiple operations is as simple as adding two
+ [`replace` statements](#replace-statement). In TDRR, the situation is a bit
+ more nuanced. Given the single root structure of the TableGen dag, replacing
+ a non-root operation is not nicely supported. It currently isn't natively
+ possible, and instead requires using multiple patterns. We could potentially
+ add another special rewrite directive, or extend `replaceWithValue`, but this
+ simply highlights how even a basic IR transformation is muddled by the
+ complexity of the host language.
+
+ ### Why not build a DSL in "X"?
+
+ Yes! Well, yes and no. To understand why, we have to consider what types of
+ users we are trying to serve and what constraints we enforce upon them. The
+ goal of PDLL is to provide a default and effective pattern language for MLIR
+ that all users of MLIR can interact with immediately, regardless of their
+ host environment. This language is available with no extra dependencies and
+ comes "free" along with MLIR. If we were to use an existing host language to
+ build our new DSL, we would need to make compromises along with it depending
+ on the language. For some, there are questions of how to enforce matching
+ environments (python2 or python3? which version?), performance
+ considerations, integration, etc. As an LLVM project, this could also mean
+ enforcing a new language dependency on the users of MLIR (many of whom may
+ not want/need such a dependency otherwise). Another issue comes along with
+ any DSL that is embedded in another language: mitigating the user impedance
+ mismatch between what the user expects from the host language and what our
+ "backend" supports.
For example, the PDL IR abstraction only contains limited support for control
+ flow. If we were to build a DSL in Python, we would need to ensure that complex
+ control flow is either handled completely or effectively errors out. Even with
+ ideal error handling, not having the expected features available creates user
+ frustration. In addition to the environment constraints, there is also the
+ issue of language tooling. With PDLL we intend to build a very robust and
+ modern toolset that is designed to cater to the needs of pattern developers,
+ including code completion, signature help, and many more features that are
+ specific to the problem we are solving. Integrating custom language tooling
+ into existing languages can be difficult, and in some cases impossible (as our
+ DSL would merely be a small subset of the existing language).
+
+ These various points have led us to the initial conclusion that the most
+ effective tool we can provide for our users is a custom tool designed for the
+ problem at hand. With all of that being said, we understand that not all users
+ have the same constraints that we have placed upon ourselves. We absolutely
+ encourage and support the existence of various PDL frontends defined in
+ different languages. This is one of the original motivating factors behind
+ building the PDL IR abstraction in the first place: to enable innovation and
+ flexibility for our users (and in turn their users). For some, such as those in
+ research and the Machine Learning space, a certain language (such as Python)
+ may already be heavily integrated into their workflow. For these users, a PDL
+ DSL in their language may be ideal, and we will remain committed to supporting
+ and endorsing that from an infrastructure point-of-view.
+
+ ## Language Specification
+
+ Note: PDLL is still under active development, and the designs discussed below
+ are not necessarily final and may be subject to change.
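+
+ To make the relationship to PDL concrete before diving into the specification:
+ a trivial PDLL pattern that erases all `my_dialect.foo` operations compiles to
+ IR in the `pdl` dialect roughly along these lines (a hand-written sketch; the
+ op name is hypothetical, and the exact output produced by the PDLL compiler
+ may differ):
+
+ ```mlir
+ pdl.pattern : benefit(1) {
+   // Match any `my_dialect.foo` operation.
+   %root = pdl.operation "my_dialect.foo"
+   // The rewrite region simply erases the matched operation.
+   pdl.rewrite %root {
+     pdl.erase %root
+   }
+ }
+ ```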
+
+ The design of PDLL is heavily influenced by and centered around the
+ [PDL IR abstraction](https://mlir.llvm.org/docs/Dialects/PDLOps/), which in turn
+ is designed as an abstract model of the core MLIR structures. This leads to a
+ design and structure that feels very much like directly writing the IR you want
+ to match.
+
+ ### Includes
+
+ PDLL supports an `include` directive to import content defined within other
+ source files. There are two types of files that may be included: `.pdll` and
+ `.td` files.
+
+ #### `.pdll` includes
+
+ When including a `.pdll` file, the contents of that file are copied directly into
+ the current file being processed. This means that any patterns, constraints,
+ rewrites, etc., defined within that file are processed along with those within
+ the current file.
+
+ #### `.td` includes
+
+ When including a `.td` file, PDLL will automatically import any pertinent
+ [ODS](https://mlir.llvm.org/docs/OpDefinitions/) information within that file.
+ This includes any defined operations, constraints, interfaces, and more, making
+ them implicitly accessible within PDLL. This is important, as ODS information
+ allows for certain PDLL constructs, such as the
+ [`operation` expression](#operation), to become much more powerful.
+
+ ### Patterns
+
+ In any pattern descriptor language, pattern definition is at the core. In PDLL,
+ patterns start with `Pattern`, optionally followed by a name and a set of pattern
+ metadata, and are terminated by a pattern body. A few simple examples are
+ shown below:
+
+ ```pdll
+ // Here we have defined an anonymous pattern:
+ Pattern {
+   // Pattern bodies are separated into two components:
+   // * Match Section
+   //    - Describes the input IR.
+   let root = op<my_dialect.reshape>(op<my_dialect.reshape>(arg: Value));
+
+   // * Rewrite Section
+   //    - Describes how to transform the IR.
+   //    - Last statement starts the rewrite.
+   replace root with op<my_dialect.reshape>(arg);
+ }
+
+ // Here we have defined a pattern named `ReshapeReshapeOptPattern` with a
+ // benefit of 10:
+ Pattern ReshapeReshapeOptPattern with benefit(10) {
+   replace op<my_dialect.reshape>(op<my_dialect.reshape>(arg: Value))
+     with op<my_dialect.reshape>(arg);
+ }
+ ```
+
+ After the definition of the pattern metadata, we specify the pattern body. The
+ structure of a pattern body is composed of two main sections: the `match`
+ section and the `rewrite` section. The `match` section of a pattern describes
+ the expected input IR, whereas the `rewrite` section describes how to transform
+ that IR. This distinction is an important one to make, as PDLL handles certain
+ variables and expressions differently within the different sections. When
+ relevant in each of the sections below, we shall explicitly call out any
+ behavioral differences.
+
+ The general layout of the `match` and `rewrite` sections is as follows: the
+ *last* statement of the pattern body is required to be an
+ [`operation rewrite statement`](#operation-rewrite-statements), and denotes the
+ `rewrite` section; every statement before it denotes the `match` section.
+
+ #### Pattern metadata
+
+ Rewrite patterns in MLIR have a set of metadata that allows for controlling
+ certain behaviors, and for providing information to the rewrite driver applying
+ the pattern. In PDLL, a pattern can provide a non-default value for this
+ metadata after the pattern name. Below, examples are shown for the different
+ types of metadata supported:
+
+ ##### Benefit
+
+ The benefit of a Pattern is an integer value that represents the "benefit" of
+ matching that pattern. It is used by pattern drivers to determine the relative
+ priorities of patterns during application; a pattern with a higher benefit is
+ generally applied before one with a lower benefit.
+
+ In PDLL, a pattern has a default benefit set to the number of input operations,
+ i.e. the number of distinct `Op` expressions/variables, in the match section.
This rule is driven by the observation that larger matches are more beneficial
+ than smaller ones, and that if a smaller one is applied first, the larger one
+ may no longer apply. Patterns can override this behavior by specifying the
+ benefit in the metadata section of the pattern:
+
+ ```pdll
+ // Here we specify that this pattern has a benefit of `10`, overriding the
+ // default behavior.
+ Pattern with benefit(10) {
+   ...
+ }
+ ```
+
+ ##### Bounded Rewrite Recursion
+
+ During pattern application, there are situations in which a pattern may be
+ applicable to the result of a previous application of that same pattern. If the
+ pattern does not properly handle this recursive application, the pattern driver
+ could become stuck in an infinite loop of application. To prevent this,
+ patterns by default are assumed to not have proper recursive bounding and will
+ not be recursively applied. A pattern can signal that it does have proper
+ handling for recursion by specifying the `recursion` flag in the pattern
+ metadata section:
+
+ ```pdll
+ // Here we signal that this pattern properly bounds recursive application.
+ Pattern with recursion {
+   ...
+ }
+ ```
+
+ #### Single Line "Lambda" Body
+
+ Patterns generally define their body using a compound block of statements, as
+ shown below:
+
+ ```pdll
+ Pattern {
+   replace op(operands: ValueRange) with operands;
+ }
+ ```
+
+ Patterns also support a lambda-like syntax for specifying simple single line
+ bodies. The lambda body of a Pattern expects a single
+ [operation rewrite statement](#operation-rewrite-statements):
+
+ ```pdll
+ Pattern => replace op(operands: ValueRange) with operands;
+ ```
+
+ ### Variables
+
+ Variables in PDLL represent specific instances of IR entities, such as `Value`s,
+ `Operation`s, `Type`s, etc.
Consider the simple pattern below:
+
+ ```pdll
+ Pattern {
+   let value: Value;
+   let root = op(value);
+
+   replace root with value;
+ }
+ ```
+
+ In this pattern we define two variables, `value` and `root`, using the `let`
+ statement. The `let` statement allows for defining variables and constraining
+ them. Every variable in PDLL is of a certain type, which defines the type of IR
+ entity the variable represents. The type of a variable may be determined via
+ either a constraint or an initializer expression.
+
+ #### Variable "Binding"
+
+ In addition to having a type, variables must also be "bound", either via an
+ initializer expression or to a non-native constraint or rewrite use within the
+ `match` section of the pattern. "Binding" a variable contextually identifies
+ that variable within either the input (i.e. `match` section) or output (i.e.
+ `rewrite` section) IR. In the `match` section, this allows for building the
+ match tree from the pattern's root operation, which must be "bound" to the
+ [operation rewrite statement](#operation-rewrite-statements) that denotes the
+ `rewrite` section of the pattern. All non-root variables within the `match`
+ section must be bound in some way to the "root" operation. To help illustrate
+ the concept, let's take a look at a quick example. Consider the `.mlir` snippet
+ below:
+
+ ```mlir
+ func @baz(%arg: i32) {
+   %result = my_dialect.foo %arg, %arg -> i32
+ }
+ ```
+
+ Say that we want to write a pattern that matches `my_dialect.foo` and replaces
+ it with its unique input argument. A naive way to write this pattern in PDLL is
+ shown below:
+
+ ```pdll
+ Pattern {
+   // ** match section ** //
+   let arg: Value;
+   let root = op<my_dialect.foo>(arg, arg);
+
+   // ** rewrite section ** //
+   replace root with arg;
+ }
+ ```
+
+ In the above pattern, the `arg` variable is "bound" to the first and second
+ operands of the `root` operation. Every use of `arg` is constrained to be the
+ same `Value`, i.e.
+ the first and second operands of `root` will be constrained to refer to the
+ same input Value. The same is true for the `root` operation: it is bound as the
+ root operation of the pattern, as it is used as the input to the top-level
+ [`replace` statement](#replace-statement) of the `rewrite` section of the
+ pattern. Writing this pattern using the C++ API, the concept of "binding"
+ becomes more clear:
+
+ ```c++
+ struct Pattern : public OpRewritePattern<my_dialect::FooOp> {
+   LogicalResult matchAndRewrite(my_dialect::FooOp root, PatternRewriter &rewriter) {
+     Value arg = root->getOperand(0);
+     if (arg != root->getOperand(1))
+       return failure();
+
+     rewriter.replaceOp(root, arg);
+     return success();
+   }
+ };
+ ```
+
+ If a variable is not "bound" properly, PDLL won't be able to identify what
+ value it would correspond to in the IR. As a final example, let's consider a
+ variable that hasn't been bound:
+
+ ```pdll
+ Pattern {
+   // ** match section ** //
+   let arg: Value;
+   let root = op<my_dialect.foo>;
+
+   // ** rewrite section ** //
+   replace root with arg;
+ }
+ ```
+
+ If we were to write this exact pattern in C++, we would end up with:
+
+ ```c++
+ struct Pattern : public OpRewritePattern<my_dialect::FooOp> {
+   LogicalResult matchAndRewrite(my_dialect::FooOp root, PatternRewriter &rewriter) {
+     // `arg` was never bound, so we don't know what input Value it was meant to
+     // correspond to.
+     Value arg;
+
+     rewriter.replaceOp(root, arg);
+     return success();
+   }
+ };
+ ```
+
+ #### Variable Constraints
+
+ ```pdll
+ // This statement defines a variable `value` that is constrained to be a `Value`.
+ let value: Value;
+
+ // This statement defines a variable `value` that is constrained to be a `Value`
+ // *and* constrained to have a single use.
+ let value: [Value, HasOneUse];
+ ```
+
+ Any number of single entity constraints may be attached directly to a variable
+ upon declaration. Within the `match` section, these constraints may add
+ additional checks on the input IR.
Within the `rewrite` section, constraints
+ are *only* used to define the type of the variable. There are a number of
+ builtin constraints that correlate to the core MLIR constructs: `Attr`, `Op`,
+ `Type`, `TypeRange`, `Value`, and `ValueRange`. Along with these, users may
+ define custom constraints that are implemented within PDLL, or natively (i.e.
+ outside of PDLL). See the general [Constraints](#constraints) section for more
+ detailed information.
+
+ #### Inline Variable Definition
+
+ Along with the `let` statement, variables may also be defined inline by
+ specifying the constraint list along with the desired variable name in the
+ first place that the variable would be used. After definition, the variable is
+ visible from all points forward. See below for an example:
+
+ ```pdll
+ // `value` is used as an operand to the operation `root`:
+ let value: Value;
+ let root = op(value);
+ replace root with value;
+
+ // `value` could also be defined "inline":
+ let root = op(value: Value);
+ replace root with value;
+ ```
+
+ Note that the point of definition of an inline variable is its point of
+ reference, meaning that an inline variable can be used immediately in the same
+ parent expression within which it was defined:
+
+ ```pdll
+ let root = op(value: Value, _: Value, value);
+ replace root with value;
+ ```
+
+ ##### Wildcard Variable Definition
+
+ Oftentimes when defining a variable inline, the variable isn't intended to be
+ used anywhere else in the pattern. For example, this may happen if you want to
+ attach constraints to a variable but have no other use for it. In these
+ situations, the "wildcard" variable can be used to remove the need to provide a
+ name, as "wildcard" variables are not visible outside of the point of
+ definition.
An example is shown below:
+
+ ```pdll
+ Pattern {
+   let root = op(arg: Value, _: Value, _: [Value, I64Value], arg);
+   replace root with arg;
+ }
+ ```
+
+ In the above example, the second operand isn't needed for the pattern, but we
+ need to provide it to signal that a second operand does exist (we just don't
+ care what it is in this pattern).
+
+ ### Operation Expression
+
+ An operation expression in PDLL represents an MLIR operation. In the `match`
+ section of the pattern, this expression models one of the input operations to
+ the pattern. In the `rewrite` section of the pattern, this expression models
+ one of the operations to create. The general structure of the operation
+ expression is very similar to that of the "generic form" of textual MLIR
+ assembly:
+
+ ```pdll
+ let root = op<my_dialect.foo>(operands: ValueRange) {attr = attr: Attr} -> (resultTypes: TypeRange);
+ ```
+
+ Let's walk through each of the different components of the expression:
+
+ #### Operation name
+
+ The operation name signifies which type of MLIR Op this operation corresponds
+ to. In the `match` section of the pattern, the name may be elided. This would
+ cause this pattern to match *any* operation type that satisfies the rest of the
+ constraints of the operation. In the `rewrite` section, the name is required.
+
+ ```pdll
+ // `root` corresponds to an instance of a `my_dialect.foo` operation.
+ let root = op<my_dialect.foo>;
+
+ // `root` could be an instance of any operation type.
+ let root = op<>;
+ ```
+
+ #### Operands
+
+ The operands section corresponds to the operands of the operation. This section
+ of an operation expression may be elided, in which case the operands are not
+ constrained in any way. When present, the operands of an operation expression
+ are interpreted in the following ways:
+
+ 1) A single instance of type `ValueRange`:
+
+ In this case, the single range is treated as all of the operands of the
+ operation:
+
+ ```pdll
+ // Define an instance with a single range of operands.
+ let root = op(allOperands: ValueRange);
+ ```
+
+ 2) A variadic number of either `Value` or `ValueRange`:
+
+ In this case, the inputs are expected to correspond with the operand groups as
+ defined on the operation in ODS.
+
+ Given the following operation definition in ODS:
+
+ ```tablegen
+ def MyIndirectCallOp {
+   let arguments = (ins FunctionType:$call, Variadic<AnyType>:$args);
+ }
+ ```
+
+ We can match the operands as so:
+
+ ```pdll
+ let root = op<my_dialect.indirect_call>(call: Value, args: ValueRange);
+ ```
+
+ #### Results
+
+ The results section corresponds to the result types of the operation. This
+ section of an operation expression may be elided, in which case the result
+ types are not constrained in any way. When present, the result types of an
+ operation expression are interpreted in the following ways:
+
+ 1) A single instance of type `TypeRange`:
+
+ In this case, the single range is treated as all of the result types of the
+ operation:
+
+ ```pdll
+ // Define an instance with a single range of types.
+ let root = op -> (allResultTypes: TypeRange);
+ ```
+
+ 2) A variadic number of either `Type` or `TypeRange`:
+
+ In this case, the inputs are expected to correspond with the result groups as
+ defined on the operation in ODS.
+
+ Given the following operation definition in ODS:
+
+ ```tablegen
+ def MyOp {
+   let results = (outs SomeType:$result, Variadic<SomeType>:$otherResults);
+ }
+ ```
+
+ We can match the result types as so:
+
+ ```pdll
+ let root = op<my_dialect.op> -> (result: Type, otherResults: TypeRange);
+ ```
+
+ #### Attributes
+
+ The attributes section of the operation expression corresponds to the attribute
+ dictionary of the operation. This section of an operation expression may be
+ elided, in which case the attributes are not constrained in any way.
The
+ composition of this component maps exactly to how attribute dictionaries are
+ structured in the MLIR textual assembly format:
+
+ ```pdll
+ let root = op {attr1 = attrValue: Attr, attr2 = attrValue2: Attr};
+ ```
+
+ Within the `{}`, attribute entries are specified by an identifier or string
+ name, corresponding to the attribute name, followed by an assignment to the
+ attribute value. If the attribute value is elided, the value of the attribute
+ is implicitly defined as a
+ [`UnitAttr`](https://mlir.llvm.org/docs/Dialects/Builtin/#unitattr).
+
+ ```pdll
+ let unitConstant = op {value};
+ ```
+
+ ##### Accessing Operation Results
+
+ In multi-operation patterns, the result of one operation often feeds as an
+ input into another. The result groups of an operation may be accessed by name
+ or by index via the `.` operator:
+
+ Note: Remember to import the definition of your operation via
+ [include](#td-includes) to ensure it is visible to PDLL.
+
+ Given the following operation definition in ODS:
+
+ ```tablegen
+ def MyResultOp {
+   let results = (outs SomeType:$result);
+ }
+ def MyInputOp {
+   let arguments = (ins SomeType:$input1, SomeType:$input2);
+ }
+ ```
+
+ We can write a pattern where `MyResultOp` feeds into `MyInputOp` as so:
+
+ ```pdll
+ // In this example, we use both `result` (the name) and `0` (the index) to refer
+ // to the first result group of `resultOp`.
+ // Note: If we elide the result types section within the match section, it means
+ // they aren't constrained, not that the operation has no results.
+ let resultOp = op<my_dialect.result_op>;
+ let inputOp = op<my_dialect.input_op>(resultOp.result, resultOp.0);
+ ```
+
+ Along with result name access, variables of `Op` type may implicitly convert to
+ `Value` or `ValueRange`.
These variables are converted to `Value` when they are
+ known (via ODS) to have only one result; in all other cases, they convert to
+ `ValueRange`:
+
+ ```pdll
+ // `resultOp` may also convert implicitly to a Value for use in `inputOp`:
+ let resultOp = op<my_dialect.result_op>;
+ let inputOp = op<my_dialect.input_op>(resultOp);
+
+ // We could also inline `resultOp` directly:
+ let inputOp = op<my_dialect.input_op>(op<my_dialect.result_op>);
+ ```
+
+ ### Attribute Expression
+
+ An attribute expression represents a literal MLIR attribute. It allows for
+ statically specifying an MLIR attribute to use, by specifying the textual form
+ of that attribute.
+
+ ```pdll
+ let trueConstant = op<arith.constant> {value = attr<"true">};
+
+ let applyResult = op<affine.apply>(args: ValueRange) {map = attr<"affine_map<(d0, d1) -> (d1 - 3)>">};
+ ```
+
+ ### Type Expression
+
+ A type expression represents a literal MLIR type. It allows for statically
+ specifying an MLIR type to use, by specifying the textual form of that type.
+
+ ```pdll
+ let i32Constant = op<arith.constant> -> (type<"i32">);
+ ```
+
+ ### Tuples
+
+ PDLL provides native support for tuples, which are used to group multiple
+ elements into a single compound value. The values in a tuple can be of any
+ type, and do not need to be of the same type. There is also no limit to the
+ number of elements held by a tuple. The elements of a tuple can be accessed by
+ index:
+
+ ```pdll
+ let tupleValue = (op<arith.constant>, attr<"10 : i32">, type<"i32">);
+
+ let opValue = tupleValue.0;
+ let attrValue = tupleValue.1;
+ let typeValue = tupleValue.2;
+ ```
+
+ You can also name the elements of a tuple and use those names to refer to the
+ values of the individual elements. An element name consists of an identifier
+ followed immediately by an equals sign (`=`).
+
+ ```pdll
+ let tupleValue = (
+   opValue = op<arith.constant>,
+   attr<"10 : i32">,
+   typeValue = type<"i32">
+ );
+
+ let opValue = tupleValue.opValue;
+ let attrValue = tupleValue.1;
+ let typeValue = tupleValue.typeValue;
+ ```
+
+ Tuples are used to represent multiple results from a
+ [constraint](#constraints-with-multiple-results) or
+ [rewrite](#rewrites-with-multiple-results).
+
+ ### Constraints
+
+ Constraints provide the ability to inject additional checks on the input IR
+ within the `match` section of a pattern. Constraints can be applied anywhere
+ within the `match` section and, depending on the type, can be applied either
+ via the constraint list of a [variable](#variables) or via the call operator
+ (e.g. `MyConstraint(...)`). There are three main categories of constraints:
+
+ #### Core Constraints
+
+ PDLL defines a number of core constraints that constrain the type of the IR
+ entity. These constraints can only be applied via the
+ [constraint list](#variable-constraints) of a variable.
+
+ * `Attr` (`<` type `>`)?
+
+ A single entity constraint that corresponds to an `mlir::Attribute`. This
+ constraint optionally takes a type component that constrains the result type of
+ the attribute.
+
+ ```pdll
+ // Define a simple variable using the `Attr` constraint.
+ let attr: Attr;
+ let constant = op {value = attr};
+
+ // Define a variable using the `Attr` constraint, that has its type
+ // constrained as well.
+ let attrType: Type;
+ let attr: Attr<attrType>;
+ let constant = op {value = attr};
+ ```
+
+ * `Op` (`<` op-name `>`)?
+
+ A single entity constraint that corresponds to an `mlir::Operation *`.
+
+ ```pdll
+ // Match only when the input is from another operation.
+ let inputOp: Op;
+ let root = op(inputOp);
+
+ // Match only when the input is from another `my_dialect.foo` operation.
+ let inputOp: Op<my_dialect.foo>;
+ let root = op(inputOp);
+ ```
+
+ * `Type`
+
+ A single entity constraint that corresponds to an `mlir::Type`.
+
+ ```pdll
+ // Define a simple variable using the `Type` constraint.
+ let resultType: Type;
+ let root = op -> (resultType);
+ ```
+
+ * `TypeRange`
+
+ A single entity constraint that corresponds to a `mlir::TypeRange`.
+
+ ```pdll
+ // Define a simple variable using the `TypeRange` constraint.
+ let resultTypes: TypeRange;
+ let root = op -> (resultTypes);
+ ```
+
+ * `Value` (`<` type-expr `>`)?
+
+ A single entity constraint that corresponds to an `mlir::Value`. This
+ constraint optionally takes a type component that constrains the type of the
+ value.
+
+ ```pdll
+ // Define a simple variable using the `Value` constraint.
+ let value: Value;
+ let root = op(value);
+
+ // Define a variable using the `Value` constraint, that has its type
+ // constrained to be the same as the result type of the `root` op.
+ let valueType: Type;
+ let input: Value<valueType>;
+ let root = op(input) -> (valueType);
+ ```
+
+ * `ValueRange` (`<` type-expr `>`)?
+
+ A single entity constraint that corresponds to a `mlir::ValueRange`. This
+ constraint optionally takes a type component that constrains the types of the
+ value range.
+
+ ```pdll
+ // Define a simple variable using the `ValueRange` constraint.
+ let inputs: ValueRange;
+ let root = op(inputs);
+
+ // Define a variable using the `ValueRange` constraint, that has its types
+ // constrained to be the same as the result types of the `root` op.
+ let valueTypes: TypeRange;
+ let inputs: ValueRange<valueTypes>;
+ let root = op(inputs) -> (valueTypes);
+ ```
+
+ #### Defining Constraints in PDLL
+
+ Aside from the core constraints, additional constraints can also be defined
+ within PDLL. This allows for building matcher fragments that can be composed
+ across many different patterns. A constraint in PDLL is defined similarly to a
+ function in traditional programming languages: it has a name, a set of input
+ arguments, a set of result types, and a body. Results of a constraint are
+ returned via a `return` statement.
A few examples are shown below:
+
+ ```pdll
+ /// A constraint that takes an input and constrains the use to an operation of
+ /// a given type.
+ Constraint UsedByFooOp(value: Value) {
+   op<my_dialect.foo>(value);
+ }
+
+ /// A constraint that returns a result of an existing operation.
+ Constraint ExtractResult(op: Op<my_dialect.foo>) -> Value {
+   return op.result;
+ }
+
+ Pattern {
+   let value = ExtractResult(op<my_dialect.foo>);
+   UsedByFooOp(value);
+ }
+ ```
+
+ ##### Constraints with multiple results
+
+ Constraints can return multiple results by returning a tuple of values. When
+ returning multiple results, each result can also be assigned a name to use when
+ indexing that tuple element. Tuple elements can be referenced by their index
+ number, or by name if they were assigned one.
+
+ ```pdll
+ // A constraint that returns multiple results, with some of the results
+ // assigned a more readable name.
+ Constraint ExtractMultipleResults(op: Op<my_dialect.foo>) -> (Value, result1: Value) {
+   return (op.result1, op.result2);
+ }
+
+ Pattern {
+   // Return a tuple of values.
+   let result = ExtractMultipleResults(op: op<my_dialect.foo>);
+
+   // Index the tuple elements by index, or by name.
+   replace op with (result.0, result.1, result.result1);
+ }
+ ```
+
+ ##### Constraint result type inference
+
+ In addition to explicitly specifying the results of the constraint via the
+ constraint signature, PDLL-defined constraints also support inferring the
+ result type from the return statement. Result type inference is active
+ whenever the constraint is defined with no result constraints:
+
+ ```pdll
+ // This constraint returns a derived operation.
+ Constraint ReturnSelf(op: Op) {
+   return op;
+ }
+ // This constraint returns a tuple of two Values.
+ Constraint ExtractMultipleResults(op: Op<my_dialect.foo>) {
+   return (result1 = op.result1, result2 = op.result2);
+ }
+
+ Pattern {
+   let values = ExtractMultipleResults(op: op<my_dialect.foo>);
+   replace op with (values.result1, values.result2);
+ }
+ ```
+
+ ##### Single Line "Lambda" Body
+
+ Constraints generally define their body using a compound block of statements,
+ as shown below:
+
+ ```pdll
+ Constraint ReturnSelf(op: Op) {
+   return op;
+ }
+ Constraint ExtractMultipleResults(op: Op<my_dialect.foo>) {
+   return (result1 = op.result1, result2 = op.result2);
+ }
+ ```
+
+ Constraints also support a lambda-like syntax for specifying simple single line
+ bodies. The lambda body of a Constraint expects a single expression, which is
+ implicitly returned:
+
+ ```pdll
+ Constraint ReturnSelf(op: Op) => op;
+
+ Constraint ExtractMultipleResults(op: Op<my_dialect.foo>)
+   => (result1 = op.result1, result2 = op.result2);
+ ```
+
+ #### Native Constraints
+
+ Constraints may also be defined outside of PDLL, and registered natively within
+ the C++ API.
+
+ ##### Importing existing Native Constraints
+
+ Constraints defined externally can be imported into PDLL by specifying a
+ constraint "declaration". This is similar to the PDLL form of defining a
+ constraint, but omits the body. Importing the declaration in this form allows
+ PDLL to statically know the expected input and output types.
+
+ ```pdll
+ // Import a single entity value native constraint that checks if the value has a
+ // single use. This constraint must be registered by the consumer of the
+ // compiled PDL.
+ Constraint HasOneUse(value: Value);
+
+ // Import a multi-entity type constraint that checks if two values have the same
+ // element type.
+ Constraint HasSameElementType(value1: Value, value2: Value);
+
+ Pattern {
+   // A single entity constraint can be applied via the variable argument list.
+   let value: HasOneUse;
+
+   // Otherwise, constraints can be applied via the call operator:
+   let value: Value = ...;
+   let value2: Value = ...;
+   HasOneUse(value);
+   HasSameElementType(value, value2);
+ }
+ ```
+
+ External constraints are those registered explicitly with the
+ `RewritePatternSet` via the C++ PDL API. For example, the constraints above may
+ be registered as:
+
+ ```c++
+ // TODO: Cleanup when we allow more accessible wrappers around PDL functions.
+ static LogicalResult hasOneUseImpl(PDLValue pdlValue, PatternRewriter &rewriter) {
+   Value value = pdlValue.cast<Value>();
+
+   return success(value.hasOneUse());
+ }
+ static LogicalResult hasSameElementTypeImpl(ArrayRef<PDLValue> pdlValues,
+                                             PatternRewriter &rewriter) {
+   Value value1 = pdlValues[0].cast<Value>();
+   Value value2 = pdlValues[1].cast<Value>();
+
+   return success(value1.getType().cast<ShapedType>().getElementType() ==
+                  value2.getType().cast<ShapedType>().getElementType());
+ }
+
+ void registerNativeConstraints(RewritePatternSet &patterns) {
+   patterns.getPDLPatterns().registerConstraintFunction(
+       "HasOneUse", hasOneUseImpl);
+   patterns.getPDLPatterns().registerConstraintFunction(
+       "HasSameElementType", hasSameElementTypeImpl);
+ }
+ ```
+
+ ##### Defining Native Constraints in PDLL
+
+ In addition to importing native constraints, PDLL also supports defining native
+ constraints directly when compiling ahead-of-time (AOT) for C++. These
+ constraints can be defined by specifying a string code block after the
+ constraint declaration:
+
+ ```pdll
+ Constraint HasOneUse(value: Value) [{
+   return success(value.hasOneUse());
+ }];
+ Constraint HasSameElementType(value1: Value, value2: Value) [{
+   return success(value1.getType().cast<ShapedType>().getElementType() ==
+                  value2.getType().cast<ShapedType>().getElementType());
+ }];
+
+ Pattern {
+   // A single entity constraint can be applied via the variable argument list.
+   let value: HasOneUse;
+
+   // Otherwise, constraints can be applied via the call operator:
+   let value: Value = ...;
+   let value2: Value = ...;
+   HasOneUse(value);
+   HasSameElementType(value, value2);
+ }
+ ```
+
+ The arguments of the constraint are accessible within the code block via the
+ same names. The types of these native variables map directly to the
+ corresponding MLIR types of the [core constraints](#core-constraints) used.
+ For example, an `Op` corresponds to a variable of type `Operation *`.
+
+ The results of the constraint can be populated using the provided `results`
+ variable. This variable is a `PDLResultList`, and expects results to be
+ populated in the order that they are defined within the result list of the
+ constraint declaration.
+
+ In addition to the above, the code block may also access the current
+ `PatternRewriter` using `rewriter`.
+
+ #### Defining Constraints Inline
+
+ In addition to global scope, PDLL Constraints and Native Constraints defined in
+ PDLL may be specified *inline* at any level of nesting. This means that they
+ may be defined in Patterns, other Constraints, Rewrites, etc.:
+
+ ```pdll
+ Constraint GlobalConstraint() {
+   Constraint LocalConstraint(value: Value) {
+     ...
+   };
+   Constraint LocalNativeConstraint(value: Value) [{
+     ...
+   }];
+   let someValue: [LocalConstraint, LocalNativeConstraint] = ...;
+ }
+ ```
+
+ Constraints that are defined inline may also elide the name when used directly:
+
+ ```pdll
+ Constraint GlobalConstraint(inputValue: Value) {
+   Constraint(value: Value) { ... }(inputValue);
+   Constraint(value: Value) [{ ...
}](inputValue);
+ }
+ ```
+
+ When defined inline, PDLL constraints may reference any previously defined
+ variable:
+
+ ```pdll
+ Constraint GlobalConstraint(op: Op) {
+   Constraint LocalConstraint() {
+     let results = op.results;
+   };
+ }
+ ```
+
+ ### Rewriters
+
+ Rewriters define the set of transformations to be performed within the
+ `rewrite` section of a pattern and, more specifically, how to transform the
+ input IR after a successful pattern match. All PDLL rewrites must be defined
+ within the `rewrite` section of the pattern. The `rewrite` section is denoted
+ by the last statement within the body of the `Pattern`, which is required to be
+ an [operation rewrite statement](#operation-rewrite-statements). There are two
+ main categories of rewrites in PDLL: operation rewrite statements, and
+ user-defined rewrites.
+
+ #### Operation Rewrite statements
+
+ Operation rewrite statements are builtin PDLL statements that perform an IR
+ transformation given a root operation. These statements are the only ones able
+ to start the `rewrite` section of a pattern, as they allow for properly
+ ["binding"](#variable-binding) the root operation of the pattern.
+
+ ##### `erase` statement
+
+ ```pdll
+ // A pattern that erases all `my_dialect.foo` operations.
+ Pattern => erase op<my_dialect.foo>;
+ ```
+
+ The `erase` statement erases a given operation.
+
+ ##### `replace` statement
+
+ ```pdll
+ // A pattern that replaces the root operation with its input value.
+ Pattern {
+   let root = op(input: Value);
+   replace root with input;
+ }
+
+ // A pattern that replaces the root operation with multiple input values.
+ Pattern {
+   let root = op(input: Value, _: Value, input2: Value);
+   replace root with (input, input2);
+ }
+
+ // A pattern that replaces the root operation with another operation.
+ // Note that when an operation is used as the replacement, we can infer its
+ // result types from the input operation.
In these cases, the result
+ // types of the replacement operation may be elided.
+ Pattern {
+   // Note: In this pattern we also inlined the `root` expression.
+   replace op<my_dialect.foo> with op<my_dialect.bar>;
+ }
+ ```
+ 
+ The `replace` statement allows for replacing a given root operation with either
+ another operation, or a set of input `Value` and `ValueRange` values. When an
+ operation is used as the replacement, we allow inferring the result types from
+ the input operation. In these cases, the result types of the replacement
+ operation may be elided. Note that no other components aside from the result
+ types will be inferred from the input operation during the replacement.
+ 
+ ##### `rewrite` statement
+ 
+ ```pdll
+ // A simple pattern that replaces the root operation with its input value.
+ Pattern {
+   let root = op<my_dialect.foo>(input: Value);
+   rewrite root with {
+     ...
+ 
+     replace root with input;
+   };
+ }
+ ```
+ 
+ The `rewrite` statement allows for rewriting a given root operation with a block
+ of nested rewriters. The root operation is not implicitly erased or replaced,
+ and any transformations to it must be expressed within the nested rewrite block.
+ The inner body may contain any number of other rewrite statements, variables, or
+ expressions.
+ 
+ #### Defining Rewriters in PDLL
+ 
+ Additional rewrites can also be defined within PDLL, which allows for building
+ rewrite fragments that can be composed across many different patterns. A
+ rewriter in PDLL is defined similarly to a function in traditional programming
+ languages; it contains a name, a set of input arguments, a set of result types,
+ and a body. Results of a rewrite are returned via a `return` statement. A few
+ examples are shown below:
+ 
+ ```pdll
+ // A rewrite that constructs and returns a new operation, given an input value.
+ Rewrite BuildFooOp(value: Value) -> Op {
+   return op<my_dialect.foo>(value);
+ }
+ 
+ Pattern {
+   // We invoke the rewrite in the same way as functions in traditional
+   // languages.
+   replace op<my_dialect.bar>(input: Value) with BuildFooOp(input);
+ }
+ ```
+ 
+ ##### Rewrites with multiple results
+ 
+ Rewrites can return multiple results by returning a tuple of values. When
+ returning multiple results, each result can also be assigned a name to use when
+ indexing that tuple element. Tuple elements can be referenced by their index
+ number, or by name if they were assigned one.
+ 
+ ```pdll
+ // A rewrite that returns multiple results, with some of the results assigned
+ // a more readable name.
+ Rewrite CreateRewriteOps() -> (Op, result1: ValueRange) {
+   return (op<my_dialect.foo>, op<my_dialect.bar>);
+ }
+ 
+ Pattern {
+   rewrite root: Op with {
+     // Invoke the rewrite, which returns a tuple of values.
+     let result = CreateRewriteOps();
+ 
+     // Index the tuple elements by index, or by name.
+     replace root with (result.0, result.1, result.result1);
+   }
+ }
+ ```
+ 
+ ##### Rewrite result type inference
+ 
+ In addition to explicitly specifying the results of the rewrite via the rewrite
+ signature, PDLL defined rewrites also support inferring the result type from the
+ return statement. Result type inference is active whenever the rewrite is
+ defined with no result constraints:
+ 
+ ```pdll
+ // This rewrite returns a derived operation.
+ Rewrite ReturnSelf(op: Op) => op;
+ // This rewrite returns a tuple of two Values.
+ Rewrite ExtractMultipleResults(op: Op) {
+   return (result1 = op.result1, result2 = op.result2);
+ }
+ 
+ Pattern {
+   rewrite root: Op with {
+     let values = ExtractMultipleResults(op);
+     replace root with (values.result1, values.result2);
+   }
+ }
+ ```
+ 
+ ##### Single Line "Lambda" Body
+ 
+ Rewrites generally define their body using a compound block of statements, as
+ shown below:
+ 
+ ```pdll
+ Rewrite ReturnSelf(op: Op) {
+   return op;
+ }
+ Rewrite EraseOp(op: Op) {
+   erase op;
+ }
+ ```
+ 
+ Rewrites also support a lambda-like syntax for specifying simple single line
+ bodies.
The lambda body of a Rewrite expects a single expression, which is
+ implicitly returned, or a single
+ [operation rewrite statement](#operation-rewrite-statements):
+ 
+ ```pdll
+ Rewrite ReturnSelf(op: Op) => op;
+ Rewrite EraseOp(op: Op) => erase op;
+ ```
+ 
+ #### Native Rewriters
+ 
+ Rewriters may also be defined outside of PDLL, and registered natively within
+ the C++ API.
+ 
+ ##### Importing existing Native Rewrites
+ 
+ Rewrites defined externally can be imported into PDLL by specifying a
+ rewrite "declaration". This is similar to the PDLL form of defining a
+ rewrite but omits the body. Importing the declaration in this form allows for
+ PDLL to statically know the expected input and output types.
+ 
+ ```pdll
+ // Import a single input native rewrite that returns a new operation. This
+ // rewrite must be registered by the consumer of the compiled PDL.
+ Rewrite BuildOp(value: Value) -> Op;
+ 
+ Pattern {
+   replace op<my_dialect.foo>(input: Value) with BuildOp(input);
+ }
+ ```
+ 
+ External rewrites are those registered explicitly with the `RewritePatternSet` via
+ the C++ PDL API. For example, the rewrite above may be registered as:
+ 
+ ```c++
+ // TODO: Cleanup when we allow more accessible wrappers around PDL functions.
+ static void buildOpImpl(ArrayRef<PDLValue> args, PatternRewriter &rewriter,
+                         PDLResultList &results) {
+   Value value = args[0].cast<Value>();
+ 
+   // insert special rewrite logic here.
+   Operation *resultOp = ...;
+   results.push_back(resultOp);
+ }
+ 
+ void registerNativeRewrite(RewritePatternSet &patterns) {
+   patterns.getPDLPatterns().registerRewriteFunction("BuildOp", buildOpImpl);
+ }
+ ```
+ 
+ ##### Defining Native Rewrites in PDLL
+ 
+ In addition to importing native rewrites, PDLL also supports defining native
+ rewrites directly when compiling ahead-of-time (AOT) for C++.
These rewrites can
+ be defined by specifying a string code block after the rewrite declaration:
+ 
+ ```pdll
+ Rewrite BuildOp(value: Value) -> (foo: Op<my_dialect.foo>, bar: Op<my_dialect.bar>) [{
+   // We push back the results into the `results` variable in the order defined
+   // by the result list of the rewrite declaration.
+   results.push_back(rewriter.create<my_dialect::FooOp>(value));
+   results.push_back(rewriter.create<my_dialect::BarOp>());
+ }];
+ 
+ Pattern {
+   let root = op<my_dialect.foo>(input: Value);
+   rewrite root with {
+     // Invoke the native rewrite and use the results when replacing the root.
+     let results = BuildOp(input);
+     replace root with (results.foo, results.bar);
+   }
+ }
+ ```
+ 
+ The arguments of the rewrite are accessible within the code block via the
+ same name. The types of these native variables map directly to the
+ corresponding MLIR type of the [core constraint](#core-constraints) used. For
+ example, an `Op` corresponds to a variable of type `Operation *`.
+ 
+ The results of the rewrite can be populated using the provided `results`
+ variable. This variable is a `PDLResultList`, and expects results to be
+ populated in the order that they are defined within the result list of the
+ rewrite declaration.
+ 
+ In addition to the above, the code block may also access the current
+ `PatternRewriter` using `rewriter`.
+ 
+ #### Defining Rewrites Inline
+ 
+ In addition to global scope, PDLL Rewrites and Native Rewrites defined in PDLL
+ may be specified *inline* at any level of nesting. This means that they may be
+ defined in Patterns, other Rewrites, etc:
+ 
+ ```pdll
+ Rewrite GlobalRewrite(inputValue: Value) {
+   Rewrite localRewrite(value: Value) {
+     ...
+   };
+   Rewrite localNativeRewrite(value: Value) [{
+     ...
+   }];
+   localRewrite(inputValue);
+   localNativeRewrite(inputValue);
+ }
+ ```
+ 
+ Rewrites that are defined inline may also elide the name when used directly:
+ 
+ ```pdll
+ Rewrite GlobalRewrite(inputValue: Value) {
+   Rewrite(value: Value) { ... }(inputValue);
+   Rewrite(value: Value) [{ ...
}](inputValue); + } + ``` + + When defined inline, PDLL rewrites may reference any previously defined + variable: + + ```pdll + Rewrite GlobalRewrite(op: Op) { + Rewrite localRewrite() { + let results = op.results; + }; + } + ``` diff --git a/mlir/docs/PassManagement.md b/mlir/docs/PassManagement.md --- a/mlir/docs/PassManagement.md +++ b/mlir/docs/PassManagement.md @@ -128,7 +128,7 @@ (operations, types, attributes, ...) can be created. Dialects must also be loaded before starting the execution of a multi-threaded pass pipeline. To this end, a pass that may create an entity from a dialect that isn't guaranteed to -already ne loaded must express this by overriding the `getDependentDialects()` +already be loaded must express this by overriding the `getDependentDialects()` method and declare this list of Dialects explicitly. ### Initialization @@ -818,7 +818,7 @@ contains the following fields: * `summary` - - A short one line summary of the pass, used as the description when + - A short one-line summary of the pass, used as the description when registering the pass. * `description` - A longer, more detailed description of the pass. This is used when @@ -847,7 +847,7 @@ * default value - The default option value. * description - - A one line description of the option. + - A one-line description of the option. * additional option flags - A string containing any additional options necessary to construct the option. @@ -870,7 +870,7 @@ * element type - The C++ type of the list element. * description - - A one line description of the option. + - A one-line description of the option. * additional option flags - A string containing any additional options necessary to construct the option. @@ -894,7 +894,7 @@ * display name - The name used when displaying the statistic. * description - - A one line description of the statistic. + - A one-line description of the statistic. 
```tablegen
def MyPass : Pass<"my-pass"> {
@@ -938,7 +938,7 @@
Instrumentations added to the PassManager are run in a stack-like fashion, i.e.
the last instrumentation to execute a `runBefore*` hook will be the first to
execute the respective `runAfter*` hook. The hooks of a `PassInstrumentation`
-class are guaranteed to be executed in a thread safe fashion, so additional
+class are guaranteed to be executed in a thread-safe fashion, so additional
synchronization is not necessary. Below is an example instrumentation that
counts the number of times the `DominanceInfo` analysis is computed:
diff --git a/mlir/docs/PassManagement.md.rej b/mlir/docs/PassManagement.md.rej
new file mode 100644
--- /dev/null
+++ b/mlir/docs/PassManagement.md.rej
@@ -0,0 +1,1246 @@
+diff a/mlir/docs/PassManagement.md b/mlir/docs/PassManagement.md (rejected hunks)
+@@ -1,1238 +1,1238 @@
+ # Pass Infrastructure
+ 
+ [TOC]
+ 
+ Passes represent the basic infrastructure for transformation and optimization.
+ This document provides an overview of the pass infrastructure in MLIR and how to
+ use it.
+ 
+ See [MLIR specification](LangRef.md) for more information about MLIR and its
+ core aspects, such as the IR structure and operations.
+ 
+ See [MLIR Rewrites](Tutorials/QuickstartRewrites.md) for a quick start on graph
+ rewriting in MLIR. If a transformation involves pattern matching operation DAGs,
+ this is a great place to start.
+ 
+ ## Operation Pass
+ 
+ In MLIR, the main unit of abstraction and transformation is an
+ [operation](LangRef.md/#operations). As such, the pass manager is designed to
+ work on instances of operations at different levels of nesting. The structure of
+ the [pass manager](#pass-manager), and the concept of nesting, is detailed
+ further below.
All passes in MLIR derive from `OperationPass` and adhere to the
+ following restrictions; any noncompliance will lead to problematic behavior in
+ multithreaded and other advanced scenarios:
+ 
+ * Must not modify any state referenced or relied upon outside the current
+   operation being operated on. This includes adding or removing operations from
+   the parent block, changing the attributes (depending on the contract of the
+   current operation)/operands/results/successors of the current operation.
+ * Must not modify the state of another operation not nested within the current
+   operation being operated on.
+   * Other threads may be operating on these operations simultaneously.
+ * Must not inspect the state of sibling operations.
+   * Other threads may be modifying these operations in parallel.
+   * Inspecting the state of ancestor/parent operations is permitted.
+ * Must not maintain mutable pass state across invocations of `runOnOperation`.
+   A pass may be run on many different operations with no guarantee of
+   execution order.
+   * When multithreading, a specific pass instance may not even execute on
+     all operations within the IR. As such, a pass should not rely on running
+     on all operations.
+ * Must not maintain any global mutable state, e.g. static variables within the
+   source file. All mutable state should be maintained by an instance of the
+   pass.
+ * Must be copy-constructible
+   * Multiple instances of the pass may be created by the pass manager to
+     process operations in parallel.
+ 
+ When creating an operation pass, there are two different types to choose from
+ depending on the usage scenario:
+ 
+ ### OperationPass : Op-Specific
+ 
+ An `op-specific` operation pass operates explicitly on a given operation type.
+ This operation type must adhere to the restrictions set by the pass manager for
+ pass execution.
+ 
+ To define an op-specific operation pass, a derived class must adhere to the
+ following:
+ 
+ * Inherit from the CRTP class `OperationPass` and provide the operation type
+   as an additional template parameter.
+ * Override the virtual `void runOnOperation()` method.
+ 
+ A simple pass may look like:
+ 
+ ```c++
+ namespace {
+ /// Here we utilize the CRTP `PassWrapper` utility class to provide some
+ /// necessary utility hooks. This is only necessary for passes defined directly
+ /// in C++. Passes defined declaratively use a cleaner mechanism for providing
+ /// these utilities.
+ struct MyFunctionPass : public PassWrapper<MyFunctionPass,
+                                            OperationPass<FuncOp>> {
+   void runOnOperation() override {
+     // Get the current FuncOp operation being operated on.
+     FuncOp f = getOperation();
+ 
+     // Walk the operations within the function.
+     f.walk([](Operation *inst) {
+       ....
+     });
+   }
+ };
+ } // namespace
+ 
+ /// Register this pass so that it can be built from a textual pass pipeline.
+ /// (Pass registration is discussed more below)
+ void registerMyPass() {
+   PassRegistration<MyFunctionPass>();
+ }
+ ```
+ 
+ ### OperationPass : Op-Agnostic
+ 
+ An `op-agnostic` pass operates on the operation type of the pass manager that it
+ is added to. This means that passes of this type may operate on several
+ different operation types. Passes of this type are generally written generically
+ using operation [interfaces](Interfaces.md) and [traits](Traits.md). Examples of
+ this type of pass are
+ [Common Sub-Expression Elimination](Passes.md/#-cse-eliminate-common-sub-expressions)
+ and [Inlining](Passes.md/#-inline-inline-function-calls).
+ 
+ To create an operation pass, a derived class must adhere to the following:
+ 
+ * Inherit from the CRTP class `OperationPass`.
+ * Override the virtual `void runOnOperation()` method.
+ 
+ A simple pass may look like:
+ 
+ ```c++
+ /// Here we utilize the CRTP `PassWrapper` utility class to provide some
+ /// necessary utility hooks.
This is only necessary for passes defined directly
+ /// in C++. Passes defined declaratively use a cleaner mechanism for providing
+ /// these utilities.
+ struct MyOperationPass : public PassWrapper<MyOperationPass, OperationPass<>> {
+   void runOnOperation() override {
+     // Get the current operation being operated on.
+     Operation *op = getOperation();
+     ...
+   }
+ };
+ ```
+ 
+ ### Dependent Dialects
+ 
+ Dialects must be loaded in the MLIRContext before entities from these dialects
+ (operations, types, attributes, ...) can be created. Dialects must also be
+ loaded before starting the execution of a multi-threaded pass pipeline. To this
+ end, a pass that may create an entity from a dialect that isn't guaranteed to
+-already ne loaded must express this by overriding the `getDependentDialects()`
++already be loaded must express this by overriding the `getDependentDialects()`
+ method and declare this list of Dialects explicitly.
+ 
+ ### Initialization
+ 
+ In certain situations, a Pass may contain state that is constructed dynamically,
+ but is potentially expensive to recompute in successive runs of the Pass. One
+ such example is when using [`PDL`-based](Dialects/PDLOps.md)
+ [patterns](PatternRewriter.md), which are compiled into a bytecode during
+ runtime. In these situations, a pass may override the following hook to
+ initialize this heavy state:
+ 
+ * `LogicalResult initialize(MLIRContext *context)`
+ 
+ This hook is executed once per run of a full pass pipeline, meaning that it does
+ not have access to the state available during a `runOnOperation` call. More
+ concretely, all necessary accesses to an `MLIRContext` should be driven via the
+ provided `context` parameter, and methods that utilize "per-run" state such as
+ `getContext`/`getOperation`/`getAnalysis`/etc. must not be used.
+ In case of an error during initialization, the pass is expected to emit an error
+ diagnostic and return a `failure()` which will abort the pass pipeline execution.
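+ 
+ As a sketch of this hook (the pass, member, and helper names below are
+ hypothetical, not part of the MLIR API), a pass that compiles an expensive
+ pattern set once per pipeline run might look like:
+ 
+ ```c++
+ struct MyExpensivePass : public PassWrapper<MyExpensivePass, OperationPass<>> {
+   LogicalResult initialize(MLIRContext *context) override {
+     // All context accesses must go through the provided `context` parameter;
+     // "per-run" hooks such as getOperation() are unavailable here.
+     RewritePatternSet patterns(context);
+     // Hypothetical helper that may fail while building the patterns.
+     if (failed(populateMyExpensivePatterns(patterns)))
+       return failure(); // Aborts execution of the pass pipeline.
+     frozenPatterns = FrozenRewritePatternSet(std::move(patterns));
+     return success();
+   }
+   void runOnOperation() override {
+     // Reuse the state computed once per pipeline run.
+     (void)applyPatternsAndFoldGreedily(getOperation(), frozenPatterns);
+   }
+   FrozenRewritePatternSet frozenPatterns;
+ };
+ ```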
+ 
+ ## Analysis Management
+ 
+ An important concept, along with transformation passes, is analyses. These are
+ conceptually similar to transformation passes, except that they compute
+ information on a specific operation without modifying it. In MLIR, analyses are
+ not passes but free-standing classes that are computed lazily on-demand and
+ cached to avoid unnecessary recomputation. An analysis in MLIR must adhere to
+ the following:
+ 
+ * Provide a valid constructor taking either an `Operation*`, or an `Operation*`
+   and an `AnalysisManager &`.
+   * The provided `AnalysisManager &` should be used to query any necessary
+     analysis dependencies.
+ * Must not modify the given operation.
+ 
+ An analysis may provide additional hooks to control various behavior:
+ 
+ * `bool isInvalidated(const AnalysisManager::PreservedAnalyses &)`
+ 
+ Given a preserved analysis set, the analysis returns true if it should truly be
+ invalidated. This allows for more fine-tuned invalidation in cases where an
+ analysis wasn't explicitly marked preserved, but may be preserved (or
+ invalidated) based upon other properties, such as analysis sets. If the analysis
+ uses any other analysis as a dependency, it must also check if the dependency
+ was invalidated.
+ 
+ ### Querying Analyses
+ 
+ The base `OperationPass` class provides utilities for querying and preserving
+ analyses for the current operation being processed.
+ 
+ * OperationPass automatically provides the following utilities for querying
+   analyses:
+   * `getAnalysis<>`
+     - Get an analysis for the current operation, constructing it if
+       necessary.
+   * `getCachedAnalysis<>`
+     - Get an analysis for the current operation, if it already exists.
+   * `getCachedParentAnalysis<>`
+     - Get an analysis for a given parent operation, if it exists.
+   * `getCachedChildAnalysis<>`
+     - Get an analysis for a given child operation, if it exists.
+   * `getChildAnalysis<>`
+     - Get an analysis for a given child operation, constructing it if
+       necessary.
+ 
+ Using the example passes defined above, let's see some examples:
+ 
+ ```c++
+ /// An interesting analysis.
+ struct MyOperationAnalysis {
+   // Compute this analysis with the provided operation.
+   MyOperationAnalysis(Operation *op);
+ };
+ 
+ struct MyOperationAnalysisWithDependency {
+   MyOperationAnalysisWithDependency(Operation *op, AnalysisManager &am) {
+     // Request other analysis as dependency
+     MyOperationAnalysis &otherAnalysis = am.getAnalysis<MyOperationAnalysis>();
+     ...
+   }
+ 
+   bool isInvalidated(const AnalysisManager::PreservedAnalyses &pa) {
+     // Check if analysis or its dependency were invalidated
+     return !pa.isPreserved<MyOperationAnalysisWithDependency>() ||
+            !pa.isPreserved<MyOperationAnalysis>();
+   }
+ };
+ 
+ void MyOperationPass::runOnOperation() {
+   // Query MyOperationAnalysis for the current operation.
+   MyOperationAnalysis &myAnalysis = getAnalysis<MyOperationAnalysis>();
+ 
+   // Query a cached instance of MyOperationAnalysis for the current operation.
+   // It will not be computed if it doesn't exist.
+   auto optionalAnalysis = getCachedAnalysis<MyOperationAnalysis>();
+   if (optionalAnalysis)
+     ...
+ 
+   // Query a cached instance of MyOperationAnalysis for the parent operation of
+   // the current operation. It will not be computed if it doesn't exist.
+   auto optionalParentAnalysis = getCachedParentAnalysis<MyOperationAnalysis>();
+   if (optionalParentAnalysis)
+     ...
+ }
+ ```
+ 
+ ### Preserving Analyses
+ 
+ Analyses that are constructed after being queried by a pass are cached to avoid
+ unnecessary computation if they are requested again later. To avoid stale
+ analyses, all analyses are assumed to be invalidated by a pass. To avoid
+ invalidation, a pass must specifically mark analyses that are known to be
+ preserved.
+ 
+ * All Pass classes automatically provide the following utilities for
+   preserving analyses:
+   * `markAllAnalysesPreserved`
+   * `markAnalysesPreserved<>`
+ 
+ ```c++
+ void MyOperationPass::runOnOperation() {
+   // Mark all analyses as preserved. This is useful if a pass can guarantee
+   // that no transformation was performed.
+   markAllAnalysesPreserved();
+ 
+   // Mark specific analyses as preserved. This is used if some transformation
+   // was performed, but some analyses were either unaffected or explicitly
+   // preserved.
+   markAnalysesPreserved<MyOperationAnalysis>();
+ }
+ ```
+ 
+ ## Pass Failure
+ 
+ Passes in MLIR are allowed to gracefully fail. This may happen if some invariant
+ of the pass was broken, potentially leaving the IR in some invalid state. If
+ such a situation occurs, the pass can directly signal a failure to the pass
+ manager via the `signalPassFailure` method. If a pass signaled a failure when
+ executing, no other passes in the pipeline will execute and the top-level call
+ to `PassManager::run` will return `failure`.
+ 
+ ```c++
+ void MyOperationPass::runOnOperation() {
+   // Signal failure on a broken invariant.
+   if (some_broken_invariant)
+     return signalPassFailure();
+ }
+ ```
+ 
+ ## Pass Manager
+ 
+ The above sections introduced the different types of passes and their
+ invariants. This section introduces the concept of a PassManager, and how it can
+ be used to configure and schedule a pass pipeline. There are two main classes
+ related to pass management, the `PassManager` and the `OpPassManager`. The
+ `PassManager` class acts as the top-level entry point, and contains various
+ configurations used for the entire pass pipeline. The `OpPassManager` class is
+ used to schedule passes to run at a specific level of nesting. The top-level
+ `PassManager` also functions as an `OpPassManager`.
+ 
+ ### OpPassManager
+ 
+ An `OpPassManager` is essentially a collection of passes to execute on an
+ operation of a specific type. This operation type must adhere to the following
+ requirement:
+ 
+ * Must be registered and marked
+   [`IsolatedFromAbove`](Traits.md/#isolatedfromabove).
+ 
+   * Passes are expected to not modify operations at or above the current
+     operation being processed.
If the operation is not isolated, it may
+     inadvertently modify or traverse the SSA use-list of an operation it is
+     not supposed to.
+ 
+ Passes can be added to a pass manager via `addPass`. The pass must either be an
+ `op-specific` pass operating on the same operation type as `OpPassManager`, or
+ an `op-agnostic` pass.
+ 
+ An `OpPassManager` is generally created by explicitly nesting a pipeline within
+ another existing `OpPassManager` via the `nest<>` method. This method takes the
+ operation type that the nested pass manager will operate on. At the top-level, a
+ `PassManager` acts as an `OpPassManager`. Nesting, in this sense, corresponds to
+ the [structural](Tutorials/UnderstandingTheIRStructure.md) nesting within
+ [Regions](LangRef.md/#regions) of the IR.
+ 
+ For example, the following `.mlir`:
+ 
+ ```
+ module {
+   spv.module "Logical" "GLSL450" {
+     func @foo() {
+       ...
+     }
+   }
+ }
+ ```
+ 
+ Has the nesting structure of:
+ 
+ ```
+ `module`
+   `spv.module`
+     `function`
+ ```
+ 
+ Below is an example of constructing a pipeline that operates on the above
+ structure:
+ 
+ ```c++
+ // Create a top-level `PassManager` class. If an operation type is not
+ // explicitly specified, the default is the builtin `module` operation.
+ PassManager pm(ctx);
+ // Note: We could also create the above `PassManager` this way.
+ PassManager pm(ctx, /*operationName=*/"builtin.module");
+ 
+ // Add a pass on the top-level module operation.
+ pm.addPass(std::make_unique<MyModulePass>());
+ 
+ // Nest a pass manager that operates on `spirv.module` operations nested
+ // directly under the top-level module.
+ OpPassManager &nestedModulePM = pm.nest<spirv::ModuleOp>();
+ nestedModulePM.addPass(std::make_unique<MySPIRVModulePass>());
+ 
+ // Nest a pass manager that operates on functions within the nested SPIRV
+ // module.
+ OpPassManager &nestedFunctionPM = nestedModulePM.nest<FuncOp>();
+ nestedFunctionPM.addPass(std::make_unique<MyFunctionPass>());
+ 
+ // Run the pass manager on the top-level module.
+ ModuleOp m = ...;
+ if (failed(pm.run(m)))
+   ...
// One of the passes signaled a failure.
+ ```
+ 
+ The above pass manager contains the following pipeline structure:
+ 
+ ```
+ OpPassManager<ModuleOp>
+   MyModulePass
+   OpPassManager<spirv::ModuleOp>
+     MySPIRVModulePass
+     OpPassManager<FuncOp>
+       MyFunctionPass
+ ```
+ 
+ These pipelines are then run over a single operation at a time. This means that,
+ for example, given a series of consecutive passes on FuncOp, it will execute all
+ passes on the first function, then all on the second function, etc. until the
+ entire program has been run through the passes. This provides several benefits:
+ 
+ * This improves the cache behavior of the compiler, because it is only
+   touching a single function at a time, instead of traversing the entire
+   program.
+ * This improves multi-threading performance by reducing the number of jobs
+   that need to be scheduled, as well as increasing the efficiency of each job.
+   An entire function pipeline can be run on each function asynchronously.
+ 
+ ## Dynamic Pass Pipelines
+ 
+ In some situations it may be useful to run a pass pipeline within another pass,
+ to allow configuring or filtering based on some invariants of the current
+ operation being operated on. For example, the
+ [Inliner Pass](Passes.md/#-inline-inline-function-calls) may want to run
+ intraprocedural simplification passes while it is inlining to produce a better
+ cost model, and provide more optimal inlining. To enable this, passes may run an
+ arbitrary `OpPassManager` on the current operation being operated on or any
+ operation nested within the current operation via the `LogicalResult
+ Pass::runPipeline(OpPassManager &, Operation *)` method. This method returns
+ whether the dynamic pipeline succeeded or failed, similarly to the result of the
+ top-level `PassManager::run` method.
A simple example is shown below:
+ 
+ ```c++
+ void MyModulePass::runOnOperation() {
+   ModuleOp module = getOperation();
+   if (hasSomeSpecificProperty(module)) {
+     OpPassManager dynamicPM("builtin.module");
+     ...; // Build the dynamic pipeline.
+     if (failed(runPipeline(dynamicPM, module)))
+       return signalPassFailure();
+   }
+ }
+ ```
+ 
+ Note: although the dynamic pipeline above was constructed within the
+ `runOnOperation` method, this is not necessary; pipelines should be cached
+ when possible, as the `OpPassManager` class can be safely copy-constructed.
+ 
+ The mechanism described in this section should be used whenever a pass pipeline
+ should run in a nested fashion, i.e. when the nested pipeline cannot be
+ scheduled statically along with the rest of the main pass pipeline. More
+ specifically, a `PassManager` should generally never need to be constructed
+ within a `Pass`. Using `runPipeline` also ensures that all analyses,
+ [instrumentations](#pass-instrumentation), and other pass manager related
+ components are integrated with the dynamic pipeline being executed.
+ 
+ ## Instance Specific Pass Options
+ 
+ MLIR provides a builtin mechanism for passes to specify options that configure
+ its behavior. These options are parsed at pass construction time independently
+ for each instance of the pass. Options are defined using the `Option<>` and
+ `ListOption<>` classes, and generally follow the
+ [LLVM command line](https://llvm.org/docs/CommandLine.html) flag definition
+ rules. One major distinction from the LLVM command line functionality is that
+ all `ListOption`s are comma-separated, and delimited sub-ranges within individual
+ elements of the list may contain commas that are not treated as separators for the
+ top-level list.
+ 
+ ```c++
+ struct MyPass ... {
+   /// Make sure that we have a valid default constructor and copy constructor to
+   /// ensure that the options are initialized properly.
+   MyPass() = default;
+   MyPass(const MyPass& pass) {}
+ 
+   /// Any parameters after the description are forwarded to llvm::cl::list and
+   /// llvm::cl::opt respectively.
+   Option<bool> exampleOption{*this, "flag-name", llvm::cl::desc("...")};
+   ListOption<int64_t> exampleListOption{*this, "list-flag-name", llvm::cl::desc("...")};
+ };
+ ```
+ 
+ For pass pipelines, the `PassPipelineRegistration` templates take an additional
+ template parameter for an optional `Option` struct definition. This struct
+ should inherit from `mlir::PassPipelineOptions` and contain the desired pipeline
+ options. When using `PassPipelineRegistration`, the constructor now takes a
+ function with the signature `void (OpPassManager &pm, const MyPipelineOptions&)`
+ which should construct the passes from the options and pass them to the pm:
+ 
+ ```c++
+ struct MyPipelineOptions : public PassPipelineOptions<MyPipelineOptions> {
+   // The structure of these options is the same as those for pass options.
+   Option<bool> exampleOption{*this, "flag-name", llvm::cl::desc("...")};
+   ListOption<int64_t> exampleListOption{*this, "list-flag-name",
+                                         llvm::cl::desc("...")};
+ };
+ 
+ void registerMyPasses() {
+   PassPipelineRegistration<MyPipelineOptions>(
+     "example-pipeline", "Run an example pipeline.",
+     [](OpPassManager &pm, const MyPipelineOptions &pipelineOptions) {
+       // Initialize the pass manager.
+     });
+ }
+ ```
+ 
+ ## Pass Statistics
+ 
+ Statistics are a way to keep track of what the compiler is doing and how
+ effective various transformations are. It is often useful to see what effect
+ specific transformations have on a particular input, and how often they trigger.
+ Pass statistics are specific to each pass instance, which allows for seeing the
+ effect of placing a particular transformation at specific places within the pass
+ pipeline. For example, they help answer questions like "What happens if I run
+ CSE again here?".
+ 
+ Statistics can be added to a pass by using the 'Pass::Statistic' class.
This
+ class takes the parent pass, a name, and a description as constructor
+ arguments. This class acts like an atomic unsigned integer, and may be
+ incremented and updated accordingly. These statistics rely on the same
+ infrastructure as
+ [`llvm::Statistic`](http://llvm.org/docs/ProgrammersManual.html#the-statistic-class-stats-option)
+ and thus have similar usage constraints. Collected statistics can be dumped by
+ the [pass manager](#pass-manager) programmatically via
+ `PassManager::enableStatistics`; or via `-pass-statistics` and
+ `-pass-statistics-display` on the command line.
+ 
+ An example is shown below:
+ 
+ ```c++
+ struct MyPass ... {
+   /// Make sure that we have a valid default constructor and copy constructor to
+   /// ensure that the options are initialized properly.
+   MyPass() = default;
+   MyPass(const MyPass& pass) {}
+   StringRef getArgument() const final {
+     // This is the argument used to refer to the pass in
+     // the textual format (on the commandline for example).
+     return "argument";
+   }
+   StringRef getDescription() const final {
+     // This is a brief description of the pass.
+     return "description";
+   }
+   /// Define the statistic to track during the execution of MyPass.
+   Statistic exampleStat{this, "exampleStat", "An example statistic"};
+ 
+   void runOnOperation() {
+     ...
+ 
+     // Update the statistic after some invariant was hit.
+     ++exampleStat;
+ 
+     ...
+   }
+ };
+ ```
+ 
+ The collected statistics may be aggregated in two types of views:
+ 
+ A pipeline view that models the structure of the pass manager; this is the
+ default view:
+ 
+ ```shell
+ $ mlir-opt -pass-pipeline='func.func(my-pass,my-pass)' foo.mlir -pass-statistics
+ 
+ ===-------------------------------------------------------------------------===
+ ... Pass statistics report ...
+ ===-------------------------------------------------------------------------===
+ 'func.func' Pipeline
+   MyPass
+     (S) 15 exampleStat - An example statistic
+   VerifierPass
+   MyPass
+     (S) 6 exampleStat - An example statistic
+   VerifierPass
+ VerifierPass
+ ```
+
+ A list view that aggregates the statistics of all instances of a specific pass
+ together:
+
+ ```shell
+ $ mlir-opt -pass-pipeline='func.func(my-pass,my-pass)' foo.mlir -pass-statistics -pass-statistics-display=list
+
+ ===-------------------------------------------------------------------------===
+ ... Pass statistics report ...
+ ===-------------------------------------------------------------------------===
+ MyPass
+   (S) 21 exampleStat - An example statistic
+ ```
+
+ ## Pass Registration
+
+ Briefly shown in the example definitions of the various pass types is the
+ `PassRegistration` class. This mechanism allows for registering pass classes so
+ that they may be created within a
+ [textual pass pipeline description](#textual-pass-pipeline-specification). An
+ example registration is shown below:
+
+ ```c++
+ void registerMyPass() {
+   PassRegistration<MyPass>();
+ }
+ ```
+
+ * `MyPass` is the name of the derived pass class.
+ * The pass `getArgument()` method is used to get the identifier that will be
+   used to refer to the pass.
+ * The pass `getDescription()` method provides a short summary describing the
+   pass.
+
+ For passes that cannot be default-constructed, `PassRegistration` accepts an
+ optional argument that takes a callback to create the pass:
+
+ ```c++
+ void registerMyPass() {
+   PassRegistration<MyPass>(
+     []() -> std::unique_ptr<Pass> {
+       std::unique_ptr<Pass> p = std::make_unique<MyPass>(/*options*/);
+       /*... non-trivial logic to configure the pass ...*/;
+       return p;
+     });
+ }
+ ```
+
+ This variant of registration can be used, for example, to accept the
+ configuration of a pass from command-line arguments and pass it to the pass
+ constructor.
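As a rough standalone analogue of this registration scheme (not MLIR's actual implementation; `MiniPass`, `registerPass`, and `MyConfiguredPass` are invented names for illustration), a registry can be modeled as a map from pass argument to factory callback, where non-default-constructible passes supply an explicit callback:

```cpp
#include <cassert>
#include <functional>
#include <map>
#include <memory>
#include <string>
#include <utility>

// Hypothetical stand-in for a pass base class.
struct MiniPass {
  virtual ~MiniPass() = default;
  virtual std::string getArgument() const = 0;
};

// Global registry: pass argument -> factory callback.
using PassFactory = std::function<std::unique_ptr<MiniPass>()>;
std::map<std::string, PassFactory> &passRegistry() {
  static std::map<std::string, PassFactory> registry;
  return registry;
}

// Analogue of PassRegistration: associate an argument with a factory.
void registerPass(const std::string &argument, PassFactory factory) {
  passRegistry()[argument] = std::move(factory);
}

// A pass that cannot be default-constructed.
struct MyConfiguredPass : MiniPass {
  explicit MyConfiguredPass(int level) : level(level) {}
  std::string getArgument() const override { return "my-configured-pass"; }
  int level;
};

void registerMyPasses() {
  // Supply a callback, mirroring the PassRegistration variant that
  // takes a factory for passes needing configuration.
  registerPass("my-configured-pass", [] {
    auto p = std::make_unique<MyConfiguredPass>(/*level=*/2);
    // ... non-trivial logic to configure the pass ...
    return std::unique_ptr<MiniPass>(std::move(p));
  });
}
```

A textual pipeline parser can then look up each pass argument in this map and invoke the factory to build the pipeline.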
+
+ Note: Make sure that the pass is copy-constructible in a way that does not share
+ data, as the [pass manager](#pass-manager) may create copies of the pass to run
+ in parallel.
+
+ ### Pass Pipeline Registration
+
+ Described above is the mechanism used for registering a specific derived pass
+ class. On top of that, MLIR allows for registering custom pass pipelines in a
+ similar fashion. This allows for custom pipelines to be available to tools like
+ `mlir-opt` in the same way that passes are, which is useful for encapsulating
+ common pipelines like the "-O1" series of passes. Pipelines are registered via a
+ similar mechanism to passes in the form of `PassPipelineRegistration`. Compared
+ to `PassRegistration`, this class takes an additional parameter in the form of a
+ pipeline builder that modifies a provided `OpPassManager`.
+
+ ```c++
+ void pipelineBuilder(OpPassManager &pm) {
+   pm.addPass(std::make_unique<MyPass>());
+   pm.addPass(std::make_unique<MyOtherPass>());
+ }
+
+ void registerMyPasses() {
+   // Register an existing pipeline builder function.
+   PassPipelineRegistration<>(
+     "argument", "description", pipelineBuilder);
+
+   // Register an inline pipeline builder.
+   PassPipelineRegistration<>(
+     "argument", "description", [](OpPassManager &pm) {
+       pm.addPass(std::make_unique<MyPass>());
+       pm.addPass(std::make_unique<MyOtherPass>());
+     });
+ }
+ ```
+
+ ### Textual Pass Pipeline Specification
+
+ The previous sections detailed how to register passes and pass pipelines with a
+ specific argument and description. Once registered, these can be used to
+ configure a pass manager from a string description. This is especially useful
+ for tools like `mlir-opt`, that configure pass managers from the command line,
+ or as options to passes that utilize
+ [dynamic pass pipelines](#dynamic-pass-pipelines).
+
+ To support the ability to describe the full structure of pass pipelines, MLIR
+ supports a custom textual description of pass pipelines.
The textual description + includes the nesting structure, the arguments of the passes and pass pipelines + to run, and any options for those passes and pipelines. A textual pipeline is + defined as a series of names, each of which may in itself recursively contain a + nested pipeline description. The syntax for this specification is as follows: + + ```ebnf + pipeline ::= op-name `(` pipeline-element (`,` pipeline-element)* `)` + pipeline-element ::= pipeline | (pass-name | pass-pipeline-name) options? + options ::= '{' (key ('=' value)?)+ '}' + ``` + + * `op-name` + * This corresponds to the mnemonic name of an operation to run passes on, + e.g. `func.func` or `builtin.module`. + * `pass-name` | `pass-pipeline-name` + * This corresponds to the argument of a registered pass or pass pipeline, + e.g. `cse` or `canonicalize`. + * `options` + * Options are specific key value pairs representing options defined by a + pass or pass pipeline, as described in the + ["Instance Specific Pass Options"](#instance-specific-pass-options) + section. See this section for an example usage in a textual pipeline. + + For example, the following pipeline: + + ```shell + $ mlir-opt foo.mlir -cse -canonicalize -convert-func-to-llvm='use-bare-ptr-memref-call-conv=1' + ``` + + Can also be specified as (via the `-pass-pipeline` flag): + + ```shell + $ mlir-opt foo.mlir -pass-pipeline='func.func(cse,canonicalize),convert-func-to-llvm{use-bare-ptr-memref-call-conv=1}' + ``` + + In order to support round-tripping a pass to the textual representation using + `OpPassManager::printAsTextualPipeline(raw_ostream&)`, override `StringRef + Pass::getArgument()` to specify the argument used when registering a pass. + + ## Declarative Pass Specification + + Some aspects of a Pass may be specified declaratively, in a form similar to + [operations](OpDefinitions.md). This specification simplifies several mechanisms + used when defining passes. 
It can be used for generating pass registration
+ calls, defining boilerplate pass utilities, and generating pass documentation.
+
+ Consider the following pass specified in C++:
+
+ ```c++
+ struct MyPass : PassWrapper<MyPass, OperationPass<ModuleOp>> {
+   MyPass() = default;
+   MyPass(const MyPass &) {}
+
+   ...
+
+   // Specify any options.
+   Option<bool> option{
+       *this, "example-option",
+       llvm::cl::desc("An example option"), llvm::cl::init(true)};
+   ListOption<int64_t> listOption{
+       *this, "example-list",
+       llvm::cl::desc("An example list option"), llvm::cl::ZeroOrMore};
+
+   // Specify any statistics.
+   Statistic statistic{this, "example-statistic", "An example statistic"};
+ };
+
+ /// Expose this pass to the outside world.
+ std::unique_ptr<Pass> foo::createMyPass() {
+   return std::make_unique<MyPass>();
+ }
+
+ /// Register this pass.
+ void foo::registerMyPass() {
+   PassRegistration<MyPass>();
+ }
+ ```
+
+ This pass may be specified declaratively as follows:
+
+ ```tablegen
+ def MyPass : Pass<"my-pass", "ModuleOp"> {
+   let summary = "My Pass Summary";
+   let description = [{
+     Here we can now give a much larger description of `MyPass`, including all of
+     its various constraints and behavior.
+   }];
+
+   // A constructor must be provided to specify how to create a default instance
+   // of MyPass.
+   let constructor = "foo::createMyPass()";
+
+   // Specify any options.
+   let options = [
+     Option<"option", "example-option", "bool", /*default=*/"true",
+            "An example option">,
+     ListOption<"listOption", "example-list", "int64_t",
+                "An example list option", "llvm::cl::ZeroOrMore">
+   ];
+
+   // Specify any statistics.
+   let statistics = [
+     Statistic<"statistic", "example-statistic", "An example statistic">
+   ];
+ }
+ ```
+
+ Using the `gen-pass-decls` generator, we can generate most of the boilerplate
+ above automatically. This generator takes as input a `-name` parameter that
+ provides a tag for the group of passes that are being generated.
This generator
+ produces two chunks of output:
+
+ The first is a code block for registering the declarative passes with the global
+ registry. For each pass, the generator produces a `registerFooPass` where `Foo`
+ is the name of the definition specified in tablegen. It also generates a
+ `registerGroupPasses`, where `Group` is the tag provided via the `-name` input
+ parameter, that registers all of the passes present.
+
+ ```c++
+ // gen-pass-decls -name="Example"
+
+ #define GEN_PASS_REGISTRATION
+ #include "Passes.h.inc"
+
+ void registerMyPasses() {
+   // Register all of the passes.
+   registerExamplePasses();
+
+   // Register `MyPass` specifically.
+   registerMyPassPass();
+ }
+ ```
+
+ The second is a base class for each of the passes, containing most of the
+ boilerplate related to pass definitions. These classes are named in the form of
+ `MyPassBase`, where `MyPass` is the name of the pass definition in tablegen. We
+ can update the original C++ pass definition as follows:
+
+ ```c++
+ /// Include the generated base pass class definitions.
+ #define GEN_PASS_CLASSES
+ #include "Passes.h.inc"
+
+ /// Define the main class as deriving from the generated base class.
+ struct MyPass : MyPassBase<MyPass> {
+   /// An explicit constructor is no longer necessary when defining
+   /// pass options and statistics; the base class takes care of that
+   /// automatically.
+   ...
+
+   /// The definitions of the options and statistics are now generated within
+   /// the base class, but are accessible in the same way.
+ };
+
+ /// Expose this pass to the outside world.
+ std::unique_ptr<Pass> foo::createMyPass() {
+   return std::make_unique<MyPass>();
+ }
+ ```
+
+ Using the `gen-pass-doc` generator, markdown documentation for each of the
+ passes can be generated. See [Passes.md](Passes.md) for example output of real
+ MLIR passes.
+
+ ### Tablegen Specification
+
+ The `Pass` class is used to begin a new pass definition.
This class takes as an
+ argument the registry argument to attribute to the pass, as well as an optional
+ string corresponding to the operation type that the pass operates on. The class
+ contains the following fields:
+
+ * `summary`
+   - A short one-line summary of the pass, used as the description when
+     registering the pass.
+ * `description`
+   - A longer, more detailed description of the pass. This is used when
+     generating pass documentation.
+ * `dependentDialects`
+   - A list of strings representing the `Dialect` classes of any entities
+     (attributes, operations, types, etc.) that this pass may introduce.
+ * `constructor`
+   - A code block used to create a default instance of the pass.
+ * `options`
+   - A list of pass options used by the pass.
+ * `statistics`
+   - A list of pass statistics used by the pass.
+
+ #### Options
+
+ Options may be specified via the `Option` and `ListOption` classes. The `Option`
+ class takes the following template parameters:
+
+ * C++ variable name
+   - A name to use for the generated option variable.
+ * argument
+   - The argument name of the option.
+ * type
+   - The C++ type of the option.
+ * default value
+   - The default option value.
+ * description
+   - A one-line description of the option.
+ * additional option flags
+   - A string containing any additional options necessary to construct the
+     option.
+
+ ```tablegen
+ def MyPass : Pass<"my-pass"> {
+   let options = [
+     Option<"option", "example-option", "bool", /*default=*/"true",
+            "An example option">,
+   ];
+ }
+ ```
+
+ The `ListOption` class takes the following fields:
+
+ * C++ variable name
+   - A name to use for the generated option variable.
+ * argument
+   - The argument name of the option.
+ * element type
+   - The C++ type of the list element.
+ * description
+   - A one-line description of the option.
+ * additional option flags
+   - A string containing any additional options necessary to construct the
+     option.
+
+ ```tablegen
+ def MyPass : Pass<"my-pass"> {
+   let options = [
+     ListOption<"listOption", "example-list", "int64_t",
+                "An example list option", "llvm::cl::ZeroOrMore">
+   ];
+ }
+ ```
+
+ #### Statistic
+
+ Statistics may be specified via the `Statistic` class, which takes the following
+ template parameters:
+
+ * C++ variable name
+   - A name to use for the generated statistic variable.
+ * display name
+   - The name used when displaying the statistic.
+ * description
+   - A one-line description of the statistic.
+
+ ```tablegen
+ def MyPass : Pass<"my-pass"> {
+   let statistics = [
+     Statistic<"statistic", "example-statistic", "An example statistic">
+   ];
+ }
+ ```
+
+ ## Pass Instrumentation
+
+ MLIR provides a customizable framework to instrument pass execution and analysis
+ computation, via the `PassInstrumentation` class. This class provides hooks into
+ the PassManager that observe various events:
+
+ * `runBeforePipeline`
+   * This callback is run just before a pass pipeline, i.e. pass manager, is
+     executed.
+ * `runAfterPipeline`
+   * This callback is run right after a pass pipeline has been executed,
+     successfully or not.
+ * `runBeforePass`
+   * This callback is run just before a pass is executed.
+ * `runAfterPass`
+   * This callback is run right after a pass has been successfully executed.
+     If this hook is executed, `runAfterPassFailed` will *not* be.
+ * `runAfterPassFailed`
+   * This callback is run right after a pass execution fails. If this hook is
+     executed, `runAfterPass` will *not* be.
+ * `runBeforeAnalysis`
+   * This callback is run just before an analysis is computed.
+   * If the analysis requested another analysis as a dependency, the
+     `runBeforeAnalysis`/`runAfterAnalysis` pair for the dependency can be
+     called from inside of the current `runBeforeAnalysis`/`runAfterAnalysis`
+     pair.
+ * `runAfterAnalysis`
+   * This callback is run right after an analysis is computed.
+
+ PassInstrumentation instances may be registered directly with a
+ [PassManager](#pass-manager) instance via the `addInstrumentation` method.
+ Instrumentations added to the PassManager are run in a stack-like fashion, i.e.
+ the last instrumentation to execute a `runBefore*` hook will be the first to
+ execute the respective `runAfter*` hook. The hooks of a `PassInstrumentation`
+ class are guaranteed to be executed in a thread-safe fashion, so additional
+ synchronization is not necessary. Below is an example instrumentation that
+ counts the number of times the `DominanceInfo` analysis is computed:
+
+ ```c++
+ struct DominanceCounterInstrumentation : public PassInstrumentation {
+   /// The cumulative count of how many times dominance has been calculated.
+   unsigned &count;
+
+   DominanceCounterInstrumentation(unsigned &count) : count(count) {}
+   void runAfterAnalysis(llvm::StringRef, TypeID id, Operation *) override {
+     if (id == TypeID::get<DominanceInfo>())
+       ++count;
+   }
+ };
+
+ MLIRContext *ctx = ...;
+ PassManager pm(ctx);
+
+ // Add the instrumentation to the pass manager.
+ unsigned domInfoCount = 0;
+ pm.addInstrumentation(
+     std::make_unique<DominanceCounterInstrumentation>(domInfoCount));
+
+ // Run the pass manager on a module operation.
+ ModuleOp m = ...;
+ if (failed(pm.run(m)))
+     ...
+
+ llvm::errs() << "DominanceInfo was computed " << domInfoCount << " times!\n";
+ ```
+
+ ### Standard Instrumentations
+
+ MLIR utilizes the pass instrumentation framework to provide a few useful
+ developer tools and utilities. Each of these instrumentations is directly
+ available to all users of the MLIR pass framework.
+
+ #### Pass Timing
+
+ The PassTiming instrumentation provides timing information about the execution
+ of passes and computation of analyses.
This provides a quick glimpse into what + passes are taking the most time to execute, as well as how much of an effect a + pass has on the total execution time of the pipeline. Users can enable this + instrumentation directly on the PassManager via `enableTiming`. This + instrumentation is also made available in mlir-opt via the `-mlir-timing` flag. + The PassTiming instrumentation provides several different display modes for the + timing results, each of which is described below: + + ##### List Display Mode + + In this mode, the results are displayed in a list sorted by total time with each + pass/analysis instance aggregated into one unique result. This view is useful + for getting an overview of what analyses/passes are taking the most time in a + pipeline. This display mode is available in mlir-opt via + `-mlir-timing-display=list`. + + ```shell + $ mlir-opt foo.mlir -mlir-disable-threading -pass-pipeline='func.func(cse,canonicalize)' -convert-func-to-llvm -mlir-timing -mlir-timing-display=list + + ===-------------------------------------------------------------------------=== + ... Pass execution timing report ... + ===-------------------------------------------------------------------------=== + Total Execution Time: 0.0203 seconds + + ---Wall Time--- --- Name --- + 0.0047 ( 55.9%) Canonicalizer + 0.0019 ( 22.2%) VerifierPass + 0.0016 ( 18.5%) LLVMLoweringPass + 0.0003 ( 3.4%) CSE + 0.0002 ( 1.9%) (A) DominanceInfo + 0.0084 (100.0%) Total + ``` + + ##### Tree Display Mode + + In this mode, the results are displayed in a nested pipeline view that mirrors + the internal pass pipeline that is being executed in the pass manager. This view + is useful for understanding specifically which parts of the pipeline are taking + the most time, and can also be used to identify when analyses are being + invalidated and recomputed. This is the default display mode. 
+ + ```shell + $ mlir-opt foo.mlir -mlir-disable-threading -pass-pipeline='func.func(cse,canonicalize)' -convert-func-to-llvm -mlir-timing + + ===-------------------------------------------------------------------------=== + ... Pass execution timing report ... + ===-------------------------------------------------------------------------=== + Total Execution Time: 0.0249 seconds + + ---Wall Time--- --- Name --- + 0.0058 ( 70.8%) 'func.func' Pipeline + 0.0004 ( 4.3%) CSE + 0.0002 ( 2.6%) (A) DominanceInfo + 0.0004 ( 4.8%) VerifierPass + 0.0046 ( 55.4%) Canonicalizer + 0.0005 ( 6.2%) VerifierPass + 0.0005 ( 5.8%) VerifierPass + 0.0014 ( 17.2%) LLVMLoweringPass + 0.0005 ( 6.2%) VerifierPass + 0.0082 (100.0%) Total + ``` + + ##### Multi-threaded Pass Timing + + When multi-threading is enabled in the pass manager the meaning of the display + slightly changes. First, a new timing column is added, `User Time`, that + displays the total time spent across all threads. Secondly, the `Wall Time` + column displays the longest individual time spent amongst all of the threads. + This means that the `Wall Time` column will continue to give an indicator on the + perceived time, or clock time, whereas the `User Time` will display the total + cpu time. + + ```shell + $ mlir-opt foo.mlir -pass-pipeline='func.func(cse,canonicalize)' -convert-func-to-llvm -mlir-timing + + ===-------------------------------------------------------------------------=== + ... Pass execution timing report ... 
+ ===-------------------------------------------------------------------------=== + Total Execution Time: 0.0078 seconds + + ---User Time--- ---Wall Time--- --- Name --- + 0.0177 ( 88.5%) 0.0057 ( 71.3%) 'func.func' Pipeline + 0.0044 ( 22.0%) 0.0015 ( 18.9%) CSE + 0.0029 ( 14.5%) 0.0012 ( 15.2%) (A) DominanceInfo + 0.0038 ( 18.9%) 0.0015 ( 18.7%) VerifierPass + 0.0089 ( 44.6%) 0.0025 ( 31.1%) Canonicalizer + 0.0006 ( 3.0%) 0.0002 ( 2.6%) VerifierPass + 0.0004 ( 2.2%) 0.0004 ( 5.4%) VerifierPass + 0.0013 ( 6.5%) 0.0013 ( 16.3%) LLVMLoweringPass + 0.0006 ( 2.8%) 0.0006 ( 7.0%) VerifierPass + 0.0200 (100.0%) 0.0081 (100.0%) Total + ``` + + #### IR Printing + + When debugging it is often useful to dump the IR at various stages of a pass + pipeline. This is where the IR printing instrumentation comes into play. This + instrumentation allows for conditionally printing the IR before and after pass + execution by optionally filtering on the pass being executed. This + instrumentation can be added directly to the PassManager via the + `enableIRPrinting` method. `mlir-opt` provides a few useful flags for utilizing + this instrumentation: + + * `print-ir-before=(comma-separated-pass-list)` + * Print the IR before each of the passes provided within the pass list. + * `print-ir-before-all` + * Print the IR before every pass in the pipeline. + + ```shell + $ mlir-opt foo.mlir -pass-pipeline='func.func(cse)' -print-ir-before=cse + + *** IR Dump Before CSE *** + func @simple_constant() -> (i32, i32) { + %c1_i32 = arith.constant 1 : i32 + %c1_i32_0 = arith.constant 1 : i32 + return %c1_i32, %c1_i32_0 : i32, i32 + } + ``` + + * `print-ir-after=(comma-separated-pass-list)` + * Print the IR after each of the passes provided within the pass list. + * `print-ir-after-all` + * Print the IR after every pass in the pipeline. 
+ + ```shell + $ mlir-opt foo.mlir -pass-pipeline='func.func(cse)' -print-ir-after=cse + + *** IR Dump After CSE *** + func @simple_constant() -> (i32, i32) { + %c1_i32 = arith.constant 1 : i32 + return %c1_i32, %c1_i32 : i32, i32 + } + ``` + + * `print-ir-after-change` + * Only print the IR after a pass if the pass mutated the IR. This helps to + reduce the number of IR dumps for "uninteresting" passes. + * Note: Changes are detected by comparing a hash of the operation before + and after the pass. This adds additional run-time to compute the hash of + the IR, and in some rare cases may result in false-positives depending + on the collision rate of the hash algorithm used. + * Note: This option should be used in unison with one of the other + 'print-ir-after' options above, as this option alone does not enable + printing. + + ```shell + $ mlir-opt foo.mlir -pass-pipeline='func.func(cse,cse)' -print-ir-after=cse -print-ir-after-change + + *** IR Dump After CSE *** + func @simple_constant() -> (i32, i32) { + %c1_i32 = arith.constant 1 : i32 + return %c1_i32, %c1_i32 : i32, i32 + } + ``` + + * `print-ir-after-failure` + * Only print IR after a pass failure. + * This option should *not* be used with the other `print-ir-after` flags + above. + + ```shell + $ mlir-opt foo.mlir -pass-pipeline='func.func(cse,bad-pass)' -print-ir-failure + + *** IR Dump After BadPass Failed *** + func @simple_constant() -> (i32, i32) { + %c1_i32 = arith.constant 1 : i32 + return %c1_i32, %c1_i32 : i32, i32 + } + ``` + + * `print-ir-module-scope` + * Always print the top-level module operation, regardless of pass type or + operation nesting level. 
+ * Note: Printing at module scope should only be used when multi-threading + is disabled(`-mlir-disable-threading`) + + ```shell + $ mlir-opt foo.mlir -mlir-disable-threading -pass-pipeline='func.func(cse)' -print-ir-after=cse -print-ir-module-scope + + *** IR Dump After CSE *** ('func.func' operation: @bar) + func @bar(%arg0: f32, %arg1: f32) -> f32 { + ... + } + + func @simple_constant() -> (i32, i32) { + %c1_i32 = arith.constant 1 : i32 + %c1_i32_0 = arith.constant 1 : i32 + return %c1_i32, %c1_i32_0 : i32, i32 + } + + *** IR Dump After CSE *** ('func.func' operation: @simple_constant) + func @bar(%arg0: f32, %arg1: f32) -> f32 { + ... + } + + func @simple_constant() -> (i32, i32) { + %c1_i32 = arith.constant 1 : i32 + return %c1_i32, %c1_i32 : i32, i32 + } + ``` + + ## Crash and Failure Reproduction + + The [pass manager](#pass-manager) in MLIR contains a builtin mechanism to + generate reproducibles in the event of a crash, or a + [pass failure](#pass-failure). This functionality can be enabled via + `PassManager::enableCrashReproducerGeneration` or via the command line flag + `pass-pipeline-crash-reproducer`. In either case, an argument is provided that + corresponds to the output `.mlir` file name that the reproducible should be + written to. The reproducible contains the configuration of the pass manager that + was executing, as well as the initial IR before any passes were run. A potential + reproducible may have the form: + + ```mlir + // configuration: -pass-pipeline='func.func(cse,canonicalize),inline' -verify-each + + module { + func @foo() { + ... + } + } + ``` + + The configuration dumped can be passed to `mlir-opt` by specifying + `-run-reproducer` flag. This will result in parsing the first line configuration + of the reproducer and adding those to the command line options. 
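A sketch of what `-run-reproducer`-style handling involves, assuming nothing about MLIR's internals (`parseReproducerConfig` is an invented helper): read the `// configuration:` comment from the first line and split it into option tokens, keeping single-quoted values (like a `-pass-pipeline` string) together:

```cpp
#include <string>
#include <vector>

// Split the "// configuration:" line of a reproducer into option tokens.
// Single-quoted sections (e.g. a -pass-pipeline value) are kept as one token.
// Illustrative sketch only; mlir-opt's actual parsing may differ.
std::vector<std::string> parseReproducerConfig(const std::string &firstLine) {
  const std::string prefix = "// configuration:";
  std::vector<std::string> tokens;
  if (firstLine.compare(0, prefix.size(), prefix) != 0)
    return tokens; // Not a configuration line.

  std::string current;
  bool inQuote = false;
  for (size_t i = prefix.size(); i < firstLine.size(); ++i) {
    char c = firstLine[i];
    if (c == '\'') {
      inQuote = !inQuote; // Toggle quoting; drop the quote character itself.
    } else if (c == ' ' && !inQuote) {
      if (!current.empty()) {
        tokens.push_back(current);
        current.clear();
      }
    } else {
      current += c;
    }
  }
  if (!current.empty())
    tokens.push_back(current);
  return tokens;
}
```

The resulting tokens can then be appended to the tool's command-line options before the rest of the file is parsed as IR.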
+ + Beyond specifying a filename, one can also register a `ReproducerStreamFactory` + function that would be invoked in the case of a crash and the reproducer written + to its stream. + + ### Local Reproducer Generation + + An additional flag may be passed to + `PassManager::enableCrashReproducerGeneration`, and specified via + `pass-pipeline-local-reproducer` on the command line, that signals that the pass + manager should attempt to generate a "local" reproducer. This will attempt to + generate a reproducer containing IR right before the pass that fails. This is + useful for situations where the crash is known to be within a specific pass, or + when the original input relies on components (like dialects or passes) that may + not always be available. + + Note: Local reproducer generation requires that multi-threading is + disabled(`-mlir-disable-threading`) + + For example, if the failure in the previous example came from `canonicalize`, + the following reproducer will be generated: + + ```mlir + // configuration: -pass-pipeline='func.func(canonicalize)' -verify-each -mlir-disable-threading + + module { + func @foo() { + ... + } + } + ``` diff --git a/mlir/docs/PatternRewriter.md b/mlir/docs/PatternRewriter.md --- a/mlir/docs/PatternRewriter.md +++ b/mlir/docs/PatternRewriter.md @@ -232,7 +232,7 @@ ## Pattern Application After a set of patterns have been defined, they are collected and provided to a -specific driver for application. A driver consists of several high levels parts: +specific driver for application. 
A driver consists of several high level parts: * Input `RewritePatternSet` diff --git a/mlir/docs/PatternRewriter.md.rej b/mlir/docs/PatternRewriter.md.rej new file mode 100644 --- /dev/null +++ b/mlir/docs/PatternRewriter.md.rej @@ -0,0 +1,464 @@ +diff a/mlir/docs/PatternRewriter.md b/mlir/docs/PatternRewriter.md (rejected hunks) +@@ -1,461 +1,461 @@ + # Pattern Rewriting : Generic DAG-to-DAG Rewriting + + [TOC] + + This document details the design and API of the pattern rewriting infrastructure + present in MLIR, a general DAG-to-DAG transformation framework. This framework + is widely used throughout MLIR for canonicalization, conversion, and general + transformation. + + For an introduction to DAG-to-DAG transformation, and the rationale behind this + framework please take a look at the + [Generic DAG Rewriter Rationale](Rationale/RationaleGenericDAGRewriter.md). + + ## Introduction + + The pattern rewriting framework can largely be decomposed into two parts: + Pattern Definition and Pattern Application. + + ## Defining Patterns + + Patterns are defined by inheriting from the `RewritePattern` class. This class + represents the base class of all rewrite patterns within MLIR, and is comprised + of the following components: + + ### Benefit + + This is the expected benefit of applying a given pattern. This benefit is static + upon construction of the pattern, but may be computed dynamically at pattern + initialization time, e.g. allowing the benefit to be derived from domain + specific information (like the target architecture). This limitation allows for + performing pattern fusion and compiling patterns into an efficient state + machine, and + [Thier, Ertl, and Krall](https://dl.acm.org/citation.cfm?id=3179501) have shown + that match predicates eliminate the need for dynamically computed costs in + almost all cases: you can simply instantiate the same pattern one time for each + possible cost and use the predicate to guard the match. 
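The "one instance per possible cost, guarded by a predicate" idea can be approximated with a small standalone sketch (`MiniPattern` and `selectPattern` are invented names, not MLIR's API): the driver tries patterns in decreasing static-benefit order and applies the first whose match predicate holds:

```cpp
#include <algorithm>
#include <functional>
#include <string>
#include <vector>

// Illustrative stand-in for a rewrite pattern with a static benefit.
struct MiniPattern {
  std::string name;
  unsigned benefit;                 // Fixed at construction time.
  std::function<bool(int)> matches; // Predicate guarding this instance.
};

// Return the name of the highest-benefit pattern whose predicate matches
// the (toy, integer-encoded) operation, or "" if none match.
std::string selectPattern(std::vector<MiniPattern> patterns, int op) {
  // Sort by decreasing benefit; stable so equal-benefit order is preserved.
  std::stable_sort(patterns.begin(), patterns.end(),
                   [](const MiniPattern &a, const MiniPattern &b) {
                     return a.benefit > b.benefit;
                   });
  for (const MiniPattern &p : patterns)
    if (p.matches(op))
      return p.name;
  return "";
}
```

Here a high-benefit specialized pattern wins whenever its predicate matches, and the driver falls back to the generic pattern otherwise, mirroring how static benefits plus match predicates subsume dynamically computed costs.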
+ + ### Root Operation Name (Optional) + + The name of the root operation that this pattern matches against. If specified, + only operations with the given root name will be provided to the `match` and + `rewrite` implementation. If not specified, any operation type may be provided. + The root operation name should be provided whenever possible, because it + simplifies the analysis of patterns when applying a cost model. To match any + operation type, a special tag must be provided to make the intent explicit: + `MatchAnyOpTypeTag`. + + ### `match` and `rewrite` implementation + + This is the chunk of code that matches a given root `Operation` and performs a + rewrite of the IR. A `RewritePattern` can specify this implementation either via + separate `match` and `rewrite` methods, or via a combined `matchAndRewrite` + method. When using the combined `matchAndRewrite` method, no IR mutation should + take place before the match is deemed successful. The combined `matchAndRewrite` + is useful when non-trivially recomputable information is required by the + matching and rewriting phase. See below for examples: + + ```c++ + class MyPattern : public RewritePattern { + public: + /// This overload constructs a pattern that only matches operations with the + /// root name of `MyOp`. + MyPattern(PatternBenefit benefit, MLIRContext *context) + : RewritePattern(MyOp::getOperationName(), benefit, context) {} + /// This overload constructs a pattern that matches any operation type. + MyPattern(PatternBenefit benefit) + : RewritePattern(benefit, MatchAnyOpTypeTag()) {} + + /// In this section, the `match` and `rewrite` implementation is specified + /// using the separate hooks. + LogicalResult match(Operation *op) const override { + // The `match` method returns `success()` if the pattern is a match, failure + // otherwise. + // ... 
+ } + void rewrite(Operation *op, PatternRewriter &rewriter) { + // The `rewrite` method performs mutations on the IR rooted at `op` using + // the provided rewriter. All mutations must go through the provided + // rewriter. + } + + /// In this section, the `match` and `rewrite` implementation is specified + /// using a single hook. + LogicalResult matchAndRewrite(Operation *op, PatternRewriter &rewriter) { + // The `matchAndRewrite` method performs both the matching and the mutation. + // Note that the match must reach a successful point before IR mutation may + // take place. + } + }; + ``` + + #### Restrictions + + Within the `match` section of a pattern, the following constraints apply: + + * No mutation of the IR is allowed. + + Within the `rewrite` section of a pattern, the following constraints apply: + + * All IR mutations, including creation, *must* be performed by the given + `PatternRewriter`. This class provides hooks for performing all of the + possible mutations that may take place within a pattern. For example, this + means that an operation should not be erased via its `erase` method. To + erase an operation, the appropriate `PatternRewriter` hook (in this case + `eraseOp`) should be used instead. + * The root operation is required to either be: updated in-place, replaced, or + erased. + + ### Application Recursion + + Recursion is an important topic in the context of pattern rewrites, as a pattern + may often be applicable to its own result. For example, imagine a pattern that + peels a single iteration from a loop operation. If the loop has multiple + peelable iterations, this pattern may apply multiple times during the + application process. By looking at the implementation of this pattern, the bound + for recursive application may be obvious, e.g. there are no peelable iterations + within the loop, but from the perspective of the pattern driver this recursion + is potentially dangerous. 
Often times the recursive application of a pattern + indicates a bug in the matching logic. These types of bugs generally do not + cause crashes, but create infinite loops within the application process. Given + this, the pattern rewriting infrastructure conservatively assumes that no + patterns have a proper bounded recursion, and will fail if recursion is + detected. A pattern that is known to have proper support for handling recursion + can signal this by calling `setHasBoundedRewriteRecursion` when initializing the + pattern. This will signal to the pattern driver that recursive application of + this pattern may happen, and the pattern is equipped to safely handle it. + + ### Debug Names and Labels + + To aid in debugging, patterns may specify: a debug name (via `setDebugName`), + which should correspond to an identifier that uniquely identifies the specific + pattern; and a set of debug labels (via `addDebugLabels`), which correspond to + identifiers that uniquely identify groups of patterns. This information is used + by various utilities to aid in the debugging of pattern rewrites, e.g. in debug + logs, to provide pattern filtering, etc. A simple code example is shown below: + + ```c++ + class MyPattern : public RewritePattern { + public: + /// Inherit constructors from RewritePattern. + using RewritePattern::RewritePattern; + + void initialize() { + setDebugName("MyPattern"); + addDebugLabels("MyRewritePass"); + } + + // ... + }; + + void populateMyPatterns(RewritePatternSet &patterns, MLIRContext *ctx) { + // Debug labels may also be attached to patterns during insertion. This allows + // for easily attaching common labels to groups of patterns. + patterns.addWithLabel("MyRewritePatterns", ctx); + } + ``` + + ### Initialization + + Several pieces of pattern state require explicit initialization by the pattern, + for example setting `setHasBoundedRewriteRecursion` if a pattern safely handles + recursive application. 
This pattern state can be initialized either in the constructor of the pattern
or via the utility `initialize` hook. Using the `initialize` hook removes the
need to redefine pattern constructors just to inject additional pattern state
initialization. An example is shown below:

```c++
class MyPattern : public RewritePattern {
public:
  /// Inherit the constructors from RewritePattern.
  using RewritePattern::RewritePattern;

  /// Initialize the pattern.
  void initialize() {
    /// Signal that this pattern safely handles recursive application.
    setHasBoundedRewriteRecursion();
  }

  // ...
};
```

### Construction

Constructing a RewritePattern should be performed by using the static
`RewritePattern::create` utility method. This method ensures that the pattern is
properly initialized and prepared for insertion into a `RewritePatternSet`.

## Pattern Rewriter

A `PatternRewriter` is a special class that allows for a pattern to communicate
with the driver of pattern application. As noted above, *all* IR mutations,
including creations, are required to be performed via the `PatternRewriter`
class. This is required because the underlying pattern driver may have state
that would be invalidated when a mutation takes place. Examples of some of the
more prevalent `PatternRewriter` APIs are shown below; please refer to the
[class documentation](https://github.com/llvm/llvm-project/blob/main/mlir/include/mlir/IR/PatternMatch.h#L235)
for a more up-to-date listing of the available API:

*   Erase an Operation : `eraseOp`

This method erases an operation that either has no results, or whose results are
all known to have no uses.

*   Notify why a `match` failed : `notifyMatchFailure`

This method allows for providing a diagnostic message within a `matchAndRewrite`
as to why a pattern failed to match. How this message is displayed back to the
user is determined by the specific pattern driver.
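As a hedged sketch of how `notifyMatchFailure` is commonly used, consider the
following pattern. Note that `MyOp` and the use-emptiness precondition are
assumptions made purely for illustration; they are not part of the MLIR API:

```c++
/// Hypothetical pattern used only to illustrate `notifyMatchFailure`; `MyOp`
/// is an assumed operation type defined elsewhere.
struct EraseUnusedMyOp : public OpRewritePattern<MyOp> {
  using OpRewritePattern<MyOp>::OpRewritePattern;

  LogicalResult matchAndRewrite(MyOp op,
                                PatternRewriter &rewriter) const override {
    // Bail out with a diagnostic when the match precondition fails. How (or
    // whether) this message is displayed is up to the pattern driver.
    if (!op.use_empty())
      return rewriter.notifyMatchFailure(op, "expected op to have no uses");

    // The match succeeded; all mutation goes through the rewriter.
    rewriter.eraseOp(op);
    return success();
  }
};
```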
*   Replace an Operation : `replaceOp`/`replaceOpWithNewOp`

These methods replace an operation's results with a set of provided values, and
erase the operation.

*   Update an Operation in-place : `(start|cancel|finalize)RootUpdate`

This is a collection of methods that provide a transaction-like API for updating
the attributes, location, operands, or successors of an operation in-place
within a pattern. An in-place update transaction is started with
`startRootUpdate`, and may either be canceled or finalized with
`cancelRootUpdate` and `finalizeRootUpdate` respectively. A convenience wrapper,
`updateRootInPlace`, is provided that wraps a `start` and `finalize` around a
callback.

*   OpBuilder API

The `PatternRewriter` inherits from the `OpBuilder` class, and thus provides all
of the same functionality present within an `OpBuilder`. This includes operation
creation, as well as many useful attribute and type construction methods.

## Pattern Application

After a set of patterns have been defined, they are collected and provided to a
specific driver for application. A driver consists of several high level parts:

*   Input `RewritePatternSet`

The input patterns to a driver are provided in the form of a
`RewritePatternSet`. This class provides a simplified API for building a list of
patterns.

*   Driver-specific `PatternRewriter`

To ensure that the driver state does not become invalidated by IR mutations
within the pattern rewriters, a driver must provide a `PatternRewriter` instance
with the necessary hooks overridden. If a driver does not need to hook into
certain mutations, a default implementation is provided that will perform the
mutation directly.
*   Pattern Application and Cost Model

Each driver is responsible for defining its own operation visitation order as
well as its own pattern cost model, but the final application is performed via a
`PatternApplicator` class. This class takes as input the `RewritePatternSet` and
transforms the patterns based upon a provided cost model. This cost model
computes a final benefit for a given pattern, using whatever driver-specific
information is necessary. After a cost model has been computed, the driver may
begin to match patterns against operations using
`PatternApplicator::matchAndRewrite`.

An example is shown below:

```c++
class MyPattern : public RewritePattern {
public:
  MyPattern(PatternBenefit benefit, MLIRContext *context)
      : RewritePattern(MyOp::getOperationName(), benefit, context) {}
};

/// Populate the pattern list.
void collectMyPatterns(RewritePatternSet &patterns, MLIRContext *ctx) {
  patterns.add<MyPattern>(/*benefit=*/1, ctx);
}

/// Define a custom PatternRewriter for use by the driver.
class MyPatternRewriter : public PatternRewriter {
public:
  MyPatternRewriter(MLIRContext *ctx) : PatternRewriter(ctx) {}

  /// Override the necessary PatternRewriter hooks here.
};

/// Apply the custom driver to `op`.
void applyMyPatternDriver(Operation *op,
                          const FrozenRewritePatternSet &patterns) {
  // Initialize the custom PatternRewriter.
  MyPatternRewriter rewriter(op->getContext());

  // Create the applicator and apply our cost model.
  PatternApplicator applicator(patterns);
  applicator.applyCostModel([](const Pattern &pattern) {
    // Apply a default cost model.
    // Note: This is just for demonstration; if the default cost model is truly
    //       desired, `applicator.applyDefaultCostModel()` should be used
    //       instead.
    return pattern.getBenefit();
  });

  // Try to match and apply a pattern.
  LogicalResult result = applicator.matchAndRewrite(op, rewriter);
  if (failed(result)) {
    // ... No patterns were applied.
  }
  // ... A pattern was successfully applied.
}
```

## Common Pattern Drivers

MLIR provides several common pattern drivers that serve a variety of different
use cases.

### Dialect Conversion Driver

This driver provides a framework in which to perform operation conversions
between and within dialects, using a concept of "legality". This framework
allows for transforming illegal operations into those supported by a provided
conversion target, via a set of pattern-based operation rewrites. This framework
also provides support for type conversions. More information on this driver can
be found [here](DialectConversion.md).

### Greedy Pattern Rewrite Driver

This driver walks the provided operations and greedily applies the patterns that
locally have the most benefit. The benefit of a pattern is decided solely by the
benefit specified on the pattern, and the relative order of the pattern within
the pattern list (when two patterns have the same local benefit). Patterns are
iteratively applied to operations until a fixed point is reached, at which point
the driver finishes. This driver may be used via
`applyPatternsAndFoldGreedily` and `applyOpPatternsAndFold`. The latter only
applies patterns to the provided operation and will not traverse the IR.

The driver is configurable and supports two modes: 1) you may opt in to a
"top-down" traversal, which seeds the worklist with each operation top down and
in a pre-order over the region tree. This is generally more efficient in compile
time. 2) the default is a "bottom-up" traversal, which builds the initial
worklist with a postorder traversal of the region tree. This may match larger
patterns with ambiguous pattern sets.

Note: This driver is the one used by the [canonicalization](Canonicalization.md)
[pass](Passes.md/#-canonicalize-canonicalize-operations) in MLIR.
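Invoking the greedy driver from within a pass can be sketched as follows. This
is a minimal sketch, not a definitive recipe: the pass class
`MyCanonicalizePass` and the `collectMyPatterns` helper are assumptions made
for illustration; only the driver entry point and `GreedyRewriteConfig` come
from the MLIR API described above:

```c++
/// Hypothetical pass body demonstrating the greedy driver.
void MyCanonicalizePass::runOnOperation() {
  RewritePatternSet patterns(&getContext());
  // Populate the set (assumed helper, mirroring the earlier examples).
  collectMyPatterns(patterns, &getContext());

  // Opt in to the "top-down" traversal described above.
  GreedyRewriteConfig config;
  config.useTopDownTraversal = true;

  // Iteratively apply the patterns until a fixed point is reached; a failure
  // result indicates the fixed point was not reached.
  if (failed(applyPatternsAndFoldGreedily(getOperation(), std::move(patterns),
                                          config)))
    signalPassFailure();
}
```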
### Debugging

To debug the execution of the greedy pattern rewrite driver,
`-debug-only=greedy-rewriter` may be used. This command line flag activates
LLVM's debug logging infrastructure solely for the greedy pattern rewriter. The
output is formatted as a tree structure, mirroring the structure of the pattern
application process. This output contains all of the actions performed by the
rewriter, how operations get processed and patterns are applied, and why they
fail.

Example output is shown below:

```
//===-------------------------------------------===//
Processing operation : 'cf.cond_br'(0x60f000001120) {
  "cf.cond_br"(%arg0)[^bb2, ^bb2] {operand_segment_sizes = dense<[1, 0, 0]> : vector<3xi32>} : (i1) -> ()

  * Pattern SimplifyConstCondBranchPred : 'cf.cond_br -> ()' {
  } -> failure : pattern failed to match

  * Pattern SimplifyCondBranchIdenticalSuccessors : 'cf.cond_br -> ()' {
    ** Insert  : 'cf.br'(0x60b000003690)
    ** Replace : 'cf.cond_br'(0x60f000001120)
  } -> success : pattern applied successfully
} -> success : pattern matched
//===-------------------------------------------===//
```

This output describes the processing of a `cf.cond_br` operation. We first try
to apply the `SimplifyConstCondBranchPred` pattern, which fails. From there,
another pattern (`SimplifyCondBranchIdenticalSuccessors`) is applied that
matches the `cf.cond_br` and replaces it with a `cf.br`.

## Debugging

### Pattern Filtering

To simplify test case definition and reduction, the `FrozenRewritePatternSet`
class provides built-in support for filtering which patterns should be provided
to the pattern driver for application. Filtering behavior is specified by
providing a `disabledPatterns` and `enabledPatterns` list when constructing the
`FrozenRewritePatternSet`. The `disabledPatterns` list should contain a set of
debug names or labels for patterns that are disabled during pattern application,
i.e.
which patterns should be filtered out. The `enabledPatterns` list should contain
a set of debug names or labels for patterns that are enabled during pattern
application; patterns that do not satisfy this constraint are filtered out. Note
that patterns specified by the `disabledPatterns` list will be filtered out even
if they match criteria in the `enabledPatterns` list. An example is shown below:

```c++
void MyPass::initialize(MLIRContext *context) {
  // No patterns are explicitly disabled.
  SmallVector<std::string> disabledPatterns;
  // Enable only patterns with a debug name or label of `MyRewritePatterns`.
  SmallVector<std::string> enabledPatterns(1, "MyRewritePatterns");

  RewritePatternSet rewritePatterns(context);
  // ...
  frozenPatterns = FrozenRewritePatternSet(std::move(rewritePatterns),
                                           disabledPatterns, enabledPatterns);
}
```

### Common Pass Utilities

Passes that utilize rewrite patterns should aim to provide a common set of
options and toggles to simplify the debugging experience when switching between
different passes/projects/etc. To aid in this endeavor, MLIR provides a common
set of utilities that can be easily included when defining a custom pass. These
are defined in `mlir/RewritePassUtil.td`; an example usage is shown below:

```tablegen
def MyRewritePass : Pass<"..."> {
  let summary = "...";
  let constructor = "createMyRewritePass()";

  // Inherit the common pattern rewrite options from `RewritePassUtils`.
  let options = RewritePassUtils.options;
}
```

#### Rewrite Pass Options

This section documents common pass options that are useful for controlling the
behavior of rewrite pattern application.

##### Pattern Filtering

Two common pattern filtering options are exposed, `disable-patterns` and
`enable-patterns`, matching the behavior of the `disabledPatterns` and
`enabledPatterns` lists described in the [Pattern Filtering](#pattern-filtering)
section above.
A snippet of the tablegen definition of these options is shown below:

```tablegen
ListOption<"disabledPatterns", "disable-patterns", "std::string",
           "Labels of patterns that should be filtered out during application">,
ListOption<"enabledPatterns", "enable-patterns", "std::string",
           "Labels of patterns that should be used during application, all "
           "other patterns are filtered out">,
```

These options may be used to provide filtering behavior when constructing any
`FrozenRewritePatternSet`s within the pass:

```c++
void MyRewritePass::initialize(MLIRContext *context) {
  RewritePatternSet rewritePatterns(context);
  // ...

  // When constructing the `FrozenRewritePatternSet`, we provide the filter
  // list options.
  frozenPatterns = FrozenRewritePatternSet(std::move(rewritePatterns),
                                           disabledPatterns, enabledPatterns);
}
```
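To close the loop, the filtered set built in `initialize` can then be applied in
the pass body. This is a hedged sketch: the pass name, the `frozenPatterns`
member, and the choice of the greedy driver are assumptions carried over from
the examples above, not a prescribed structure:

```c++
/// Hypothetical pass body applying the pre-filtered pattern set.
void MyRewritePass::runOnOperation() {
  // `frozenPatterns` was constructed in `initialize` with the filtering
  // options applied, so disabled patterns never reach the driver.
  if (failed(applyPatternsAndFoldGreedily(getOperation(), frozenPatterns)))
    signalPassFailure();
}
```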