diff --git a/mlir/docs/Diagnostics.md b/mlir/docs/Diagnostics.md --- a/mlir/docs/Diagnostics.md +++ b/mlir/docs/Diagnostics.md @@ -11,69 +11,9 @@ ## Source Locations Source location information is extremely important for any compiler, because it -provides a baseline for debuggability and error-reporting. MLIR provides several -different location types depending on the situational need. - -### CallSite Location - -``` -callsite-location ::= 'callsite' '(' location 'at' location ')' -``` - -An instance of this location allows for representing a directed stack of -location usages. This connects a location of a `callee` with the location of a -`caller`. - -### FileLineCol Location - -``` -filelinecol-location ::= string-literal ':' integer-literal ':' integer-literal -``` - -An instance of this location represents a tuple of file, line number, and column -number. This is similar to the type of location that you get from most source -languages. - -### Fused Location - -``` -fused-location ::= `fused` fusion-metadata? '[' location (location ',')* ']' -fusion-metadata ::= '<' attribute-value '>' -``` - -An instance of a `fused` location represents a grouping of several other source -locations, with optional metadata that describes the context of the fusion. -There are many places within a compiler in which several constructs may be fused -together, e.g. pattern rewriting, that normally result partial or even total -loss of location information. With `fused` locations, this is a non-issue. - -### Name Location - -``` -name-location ::= string-literal ('(' location ')')? -``` - -An instance of this location allows for attaching a name to a child location. -This can be useful for representing the locations of variable, or node, -definitions. - -### Opaque Location - -An instance of this location essentially contains a pointer to some data -structure that is external to MLIR and an optional location that can be used if -the first one is not suitable. 
Since it contains an external structure, only the -optional location is used during serialization. - -### Unknown Location - -``` -unknown-location ::= `unknown` -``` - -Source location information is an extremely integral part of the MLIR -infrastructure. As such, location information is always present in the IR, and -must explicitly be set to unknown. Thus an instance of the `unknown` location, -represents an unspecified source location. +provides a baseline for debuggability and error-reporting. The +[builtin dialect](Dialects/Builtin.md) provides several different location +attribute types depending on the situational need. ## Diagnostic Engine diff --git a/mlir/docs/Dialects/Builtin.md b/mlir/docs/Dialects/Builtin.md new file mode 100644 --- /dev/null +++ b/mlir/docs/Dialects/Builtin.md @@ -0,0 +1,32 @@ +# Builtin Dialect + +The builtin dialect contains a core set of Attributes, Operations, and Types +that have wide applicability across a very large number of domains and +abstractions. Many of the components of this dialect are also instrumental in +the implementation of the core IR. As such, this dialect is implicitly loaded in +every `MLIRContext`, and available directly to all users of MLIR. + +Given the far-reaching nature of this dialect and the fact that MLIR is +extensible by design, any potential additions are heavily scrutinized. + +[TOC] + +## Attributes + +[include "Dialects/BuiltinAttributes.md"] + +## Location Attributes + +A subset of the builtin attribute values correspond to +[source locations](../Diagnostics.md#source-locations) that may be attached to +Operations. 
+ +[include "Dialects/BuiltinLocationAttributes.md"] + +## Operations + +[include "Dialects/BuiltinOps.md"] + +## Types + +[include "Dialects/BuiltinTypes.md"] diff --git a/mlir/docs/LangRef.md b/mlir/docs/LangRef.md --- a/mlir/docs/LangRef.md +++ b/mlir/docs/LangRef.md @@ -60,14 +60,13 @@ One obvious application of MLIR is to represent an [SSA-based](https://en.wikipedia.org/wiki/Static_single_assignment_form) IR, -like the LLVM core IR, with appropriate choice of Operation Types to define -[Modules](#module), [Functions](#functions), Branches, Allocations, and -verification constraints to ensure the SSA Dominance property. MLIR includes a -'standard' dialect which defines just such structures. However, MLIR is -intended to be general enough to represent other compiler-like data -structures, such as Abstract Syntax Trees in a language frontend, generated -instructions in a target-specific backend, or circuits in a High-Level -Synthesis tool. +like the LLVM core IR, with appropriate choice of operation types to define +Modules, Functions, Branches, Memory Allocation, and verification constraints to +ensure the SSA Dominance property. MLIR includes a collection of dialects which +define just such structures. However, MLIR is intended to be general enough to +represent other compiler-like data structures, such as Abstract Syntax Trees in +a language frontend, generated instructions in a target-specific backend, or +circuits in a High-Level Synthesis tool. Here's an example of an MLIR module: @@ -328,96 +327,12 @@ This allows those dialects to support _custom assembly form_ for parsing and printing operations. In the operation sets listed below, we show both forms. -### Terminator Operations +### Builtin Operations -These are a special category of operations that *must* terminate a block, e.g. -[branches](Dialects/Standard.md#terminator-operations). These operations may -also have a list of successors ([blocks](#blocks) and their arguments). 
- -Example: - -```mlir -// Branch to ^bb1 or ^bb2 depending on the condition %cond. -// Pass value %v to ^bb2, but not to ^bb1. -"cond_br"(%cond)[^bb1, ^bb2(%v : index)] : (i1) -> () -``` - -### Module - -``` -module ::= `module` symbol-ref-id? (`attributes` dictionary-attribute)? region -``` - -An MLIR Module represents a top-level container operation. It contains a single -[SSACFG region](#control-flow-and-ssacfg-regions) containing a single block -which can contain any operations. Operations within this region cannot -implicitly capture values defined outside the module, i.e. Modules are -[IsolatedFromAbove](Traits.md#isolatedfromabove). Modules have an optional -[symbol name](SymbolsAndSymbolTables.md) which can be used to refer to them in -operations. - -### Functions - -An MLIR Function is an operation with a name containing a single [SSACFG -region](#control-flow-and-ssacfg-regions). Operations within this region -cannot implicitly capture values defined outside of the function, -i.e. Functions are [IsolatedFromAbove](Traits.md#isolatedfromabove). All -external references must use function arguments or attributes that establish a -symbolic connection (e.g. symbols referenced by name via a string attribute -like [SymbolRefAttr](#symbol-reference-attribute)): - -``` -function ::= `func` function-signature function-attributes? function-body? - -function-signature ::= symbol-ref-id `(` argument-list `)` - (`->` function-result-list)? - -argument-list ::= (named-argument (`,` named-argument)*) | /*empty*/ -argument-list ::= (type dictionary-attribute? (`,` type dictionary-attribute?)*) - | /*empty*/ -named-argument ::= value-id `:` type dictionary-attribute? - -function-result-list ::= function-result-list-parens - | non-function-type -function-result-list-parens ::= `(` `)` - | `(` function-result-list-no-parens `)` -function-result-list-no-parens ::= function-result (`,` function-result)* -function-result ::= type dictionary-attribute? 
- -function-attributes ::= `attributes` dictionary-attribute -function-body ::= region -``` - -An external function declaration (used when referring to a function declared -in some other module) has no body. While the MLIR textual form provides a nice -inline syntax for function arguments, they are internally represented as -"block arguments" to the first block in the region. - -Only dialect attribute names may be specified in the attribute dictionaries -for function arguments, results, or the function itself. - -Examples: - -```mlir -// External function definitions. -func @abort() -func @scribble(i32, i64, memref) -> f64 - -// A function that returns its argument twice: -func @count(%x: i64) -> (i64, i64) - attributes {fruit: "banana"} { - return %x, %x: i64, i64 -} - -// A function with an argument attribute -func @example_fn_arg(%x: i32 {swift.self = unit}) - -// A function with a result attribute -func @example_fn_result() -> (f64 {dialectName.attrName = 0 : i64}) - -// A function with an attribute -func @example_fn_attr() attributes {dialectName.attrName = false} -``` +The [builtin dialect](Dialects/Builtin.md) defines a select few operations that +are widely applicable across MLIR dialects, such as a universal conversion cast +operation that simplifies inter/intra dialect conversion. This dialect also +defines a top-level `module` operation that represents a useful IR container. ## Blocks @@ -701,14 +616,10 @@ ## Type System -Each value in MLIR has a type defined by the type system below. There are a -number of primitive types (like integers) and also aggregate types for tensors -and memory buffers. MLIR [builtin types](#builtin-types) do not include -structures, arrays, or dictionaries. - -MLIR has an open type system (i.e. there is no fixed list of types), and types -may have application-specific semantics. For example, MLIR supports a set of -[dialect types](#dialect-types). +Each value in MLIR has a type defined by the type system. 
MLIR has an open type +system (i.e. there is no fixed list of types), and types may have +application-specific semantics. MLIR dialects may define any number of types +with no restrictions on the abstractions they represent. ``` type ::= type-alias | dialect-type | builtin-type @@ -806,497 +717,14 @@ that are not allowed in the lighter syntax, as well as unbalanced `<>` characters. -See [here](Tutorials/DefiningAttributesAndTypes.md) to learn how to define dialect types. +See [here](Tutorials/DefiningAttributesAndTypes.md) to learn how to define +dialect types. ### Builtin Types -Builtin types are a core set of [dialect types](#dialect-types) that are defined -in a builtin dialect and thus available to all users of MLIR. - -``` -builtin-type ::= complex-type - | float-type - | function-type - | index-type - | integer-type - | memref-type - | none-type - | tensor-type - | tuple-type - | vector-type -``` - -#### Complex Type - -Syntax: - -``` -complex-type ::= `complex` `<` type `>` -``` - -The value of `complex` type represents a complex number with a parameterized -element type, which is composed of a real and imaginary value of that element -type. The element must be a floating point or integer scalar type. - -Examples: - -```mlir -complex -complex -``` - -#### Floating Point Types - -Syntax: - -``` -// Floating point. -float-type ::= `f16` | `bf16` | `f32` | `f64` | `f80` | `f128` -``` - -MLIR supports float types of certain widths that are widely used as indicated -above. - -#### Function Type - -Syntax: - -``` -// MLIR functions can return multiple values. -function-result-type ::= type-list-parens - | non-function-type - -function-type ::= type-list-parens `->` function-result-type -``` - -MLIR supports first-class functions: for example, the -[`constant` operation](Dialects/Standard.md#stdconstant-constantop) produces the -address of a function as a value. 
This value may be passed to and -returned from functions, merged across control flow boundaries with -[block arguments](#blocks), and called with the -[`call_indirect` operation](Dialects/Standard.md#call-indirect-operation). - -Function types are also used to indicate the arguments and results of -[operations](#operations). - -#### Index Type - -Syntax: - -``` -// Target word-sized integer. -index-type ::= `index` -``` - -The `index` type is a signless integer whose size is equal to the natural -machine word of the target -([rationale](Rationale/Rationale.md#integer-signedness-semantics)) and is used -by the affine constructs in MLIR. Unlike fixed-size integers, it cannot be used -as an element of vector -([rationale](Rationale/Rationale.md#index-type-disallowed-in-vector-types)). - -**Rationale:** integers of platform-specific bit widths are practical to express -sizes, dimensionalities and subscripts. - -#### Integer Type - -Syntax: - -``` -// Sized integers like i1, i4, i8, i16, i32. -signed-integer-type ::= `si` [1-9][0-9]* -unsigned-integer-type ::= `ui` [1-9][0-9]* -signless-integer-type ::= `i` [1-9][0-9]* -integer-type ::= signed-integer-type | - unsigned-integer-type | - signless-integer-type -``` - -MLIR supports arbitrary precision integer types. Integer types have a designated -width and may have signedness semantics. - -**Rationale:** low precision integers (like `i2`, `i4` etc) are useful for -low-precision inference chips, and arbitrary precision integers are useful for -hardware synthesis (where a 13 bit multiplier is a lot cheaper/smaller than a 16 -bit one). - -TODO: Need to decide on a representation for quantized integers -([initial thoughts](Rationale/Rationale.md#quantized-integer-operations)). - -#### Memref Type - -Syntax: - -``` -memref-type ::= ranked-memref-type | unranked-memref-type - -ranked-memref-type ::= `memref` `<` dimension-list-ranked type - (`,` layout-specification)? (`,` memory-space)? 
`>` - -unranked-memref-type ::= `memref` `<*x` type (`,` memory-space)? `>` - -stride-list ::= `[` (dimension (`,` dimension)*)? `]` -strided-layout ::= `offset:` dimension `,` `strides: ` stride-list -semi-affine-map-composition ::= (semi-affine-map `,` )* semi-affine-map -layout-specification ::= semi-affine-map-composition | strided-layout -memory-space ::= integer-literal /* | TODO: address-space-id */ -``` - -A `memref` type is a reference to a region of memory (similar to a buffer -pointer, but more powerful). The buffer pointed to by a memref can be allocated, -aliased and deallocated. A memref can be used to read and write data from/to the -memory region which it references. Memref types use the same shape specifier as -tensor types. Note that `memref`, `memref<0 x f32>`, `memref<1 x 0 x f32>`, -and `memref<0 x 1 x f32>` are all different types. - -A `memref` is allowed to have an unknown rank (e.g. `memref<*xf32>`). The -purpose of unranked memrefs is to allow external library functions to receive -memref arguments of any rank without versioning the functions based on the rank. -Other uses of this type are disallowed or will have undefined behavior. - -##### Codegen of Unranked Memref - -Using unranked memref in codegen besides the case mentioned above is highly -discouraged. Codegen is concerned with generating loop nests and specialized -instructions for high-performance, unranked memref is concerned with hiding the -rank and thus, the number of enclosing loops required to iterate over the data. -However, if there is a need to code-gen unranked memref, one possible path is to -cast into a static ranked type based on the dynamic rank. Another possible path -is to emit a single while loop conditioned on a linear index and perform -delinearization of the linear index to a dynamic array containing the (unranked) -indices. 
While this is possible, it is expected to not be a good idea to perform -this during codegen as the cost of the translations is expected to be -prohibitive and optimizations at this level are not expected to be worthwhile. -If expressiveness is the main concern, irrespective of performance, passing -unranked memrefs to an external C++ library and implementing rank-agnostic logic -there is expected to be significantly simpler. - -Unranked memrefs may provide expressiveness gains in the future and help bridge -the gap with unranked tensors. Unranked memrefs will not be expected to be -exposed to codegen but one may query the rank of an unranked memref (a special -op will be needed for this purpose) and perform a switch and cast to a ranked -memref as a prerequisite to codegen. - -Example: - -```mlir -// With static ranks, we need a function for each possible argument type -%A = alloc() : memref<16x32xf32> -%B = alloc() : memref<16x32x64xf32> -call @helper_2D(%A) : (memref<16x32xf32>)->() -call @helper_3D(%B) : (memref<16x32x64xf32>)->() - -// With unknown rank, the functions can be unified under one unranked type -%A = alloc() : memref<16x32xf32> -%B = alloc() : memref<16x32x64xf32> -// Remove rank info -%A_u = memref_cast %A : memref<16x32xf32> -> memref<*xf32> -%B_u = memref_cast %B : memref<16x32x64xf32> -> memref<*xf32> -// call same function with dynamic ranks -call @helper(%A_u) : (memref<*xf32>)->() -call @helper(%B_u) : (memref<*xf32>)->() -``` - -The core syntax and representation of a layout specification is a -[semi-affine map](Dialects/Affine.md#semi-affine-maps). Additionally, syntactic -sugar is supported to make certain layout specifications more intuitive to read. -For the moment, a `memref` supports parsing a strided form which is converted to -a semi-affine map automatically. - -The memory space of a memref is specified by a target-specific attribute. -It might be an integer value, string, dictionary or custom dialect attribute. 
-The empty memory space (attribute is None) is target specific. - -The notionally dynamic value of a memref value includes the address of the -buffer allocated, as well as the symbols referred to by the shape, layout map, -and index maps. - -Examples of memref static type - -```mlir -// Identity index/layout map -#identity = affine_map<(d0, d1) -> (d0, d1)> - -// Column major layout. -#col_major = affine_map<(d0, d1, d2) -> (d2, d1, d0)> - -// A 2-d tiled layout with tiles of size 128 x 256. -#tiled_2d_128x256 = affine_map<(d0, d1) -> (d0 div 128, d1 div 256, d0 mod 128, d1 mod 256)> - -// A tiled data layout with non-constant tile sizes. -#tiled_dynamic = affine_map<(d0, d1)[s0, s1] -> (d0 floordiv s0, d1 floordiv s1, - d0 mod s0, d1 mod s1)> - -// A layout that yields a padding on two at either end of the minor dimension. -#padded = affine_map<(d0, d1) -> (d0, (d1 + 2) floordiv 2, (d1 + 2) mod 2)> - - -// The dimension list "16x32" defines the following 2D index space: -// -// { (i, j) : 0 <= i < 16, 0 <= j < 32 } -// -memref<16x32xf32, #identity> - -// The dimension list "16x4x?" defines the following 3D index space: -// -// { (i, j, k) : 0 <= i < 16, 0 <= j < 4, 0 <= k < N } -// -// where N is a symbol which represents the runtime value of the size of -// the third dimension. -// -// %N here binds to the size of the third dimension. -%A = alloc(%N) : memref<16x4x?xf32, #col_major> - -// A 2-d dynamic shaped memref that also has a dynamically sized tiled layout. -// The memref index space is of size %M x %N, while %B1 and %B2 bind to the -// symbols s0, s1 respectively of the layout map #tiled_dynamic. Data tiles of -// size %B1 x %B2 in the logical space will be stored contiguously in memory. -// The allocation size will be (%M ceildiv %B1) * %B1 * (%N ceildiv %B2) * %B2 -// f32 elements. -%T = alloc(%M, %N) [%B1, %B2] : memref - -// A memref that has a two-element padding at either end. The allocation size -// will fit 16 * 64 float elements of data. 
-%P = alloc() : memref<16x64xf32, #padded> - -// Affine map with symbol 's0' used as offset for the first dimension. -#imapS = affine_map<(d0, d1) [s0] -> (d0 + s0, d1)> -// Allocate memref and bind the following symbols: -// '%n' is bound to the dynamic second dimension of the memref type. -// '%o' is bound to the symbol 's0' in the affine map of the memref type. -%n = ... -%o = ... -%A = alloc (%n)[%o] : <16x?xf32, #imapS> -``` - -##### Index Space - -A memref dimension list defines an index space within which the memref can be -indexed to access data. - -##### Index - -Data is accessed through a memref type using a multidimensional index into the -multidimensional index space defined by the memref's dimension list. - -Examples - -```mlir -// Allocates a memref with 2D index space: -// { (i, j) : 0 <= i < 16, 0 <= j < 32 } -%A = alloc() : memref<16x32xf32, #imapA> - -// Loads data from memref '%A' using a 2D index: (%i, %j) -%v = load %A[%i, %j] : memref<16x32xf32, #imapA> -``` - -##### Index Map - -An index map is a one-to-one -[semi-affine map](Dialects/Affine.md#semi-affine-maps) that transforms a -multidimensional index from one index space to another. For example, the -following figure shows an index map which maps a 2-dimensional index from a 2x2 -index space to a 3x3 index space, using symbols `S0` and `S1` as offsets. - -![Index Map Example](/includes/img/index-map.svg) - -The number of domain dimensions and range dimensions of an index map can be -different, but must match the number of dimensions of the input and output index -spaces on which the map operates. The index space is always non-negative and -integral. In addition, an index map must specify the size of each of its range -dimensions onto which it maps. Index map symbols must be listed in order with -symbols for dynamic dimension sizes first, followed by other required symbols. 
- -##### Layout Map - -A layout map is a [semi-affine map](Dialects/Affine.md#semi-affine-maps) which -encodes logical to physical index space mapping, by mapping input dimensions to -their ordering from most-major (slowest varying) to most-minor (fastest -varying). Therefore, an identity layout map corresponds to a row-major layout. -Identity layout maps do not contribute to the MemRef type identification and are -discarded on construction. That is, a type with an explicit identity map is -`memref(i,j)>` is strictly the same as the one without layout -maps, `memref`. - -Layout map examples: - -```mlir -// MxN matrix stored in row major layout in memory: -#layout_map_row_major = (i, j) -> (i, j) - -// MxN matrix stored in column major layout in memory: -#layout_map_col_major = (i, j) -> (j, i) - -// MxN matrix stored in a 2-d blocked/tiled layout with 64x64 tiles. -#layout_tiled = (i, j) -> (i floordiv 64, j floordiv 64, i mod 64, j mod 64) -``` - -##### Affine Map Composition - -A memref specifies a semi-affine map composition as part of its type. A -semi-affine map composition is a composition of semi-affine maps beginning with -zero or more index maps, and ending with a layout map. The composition must be -conformant: the number of dimensions of the range of one map, must match the -number of dimensions of the domain of the next map in the composition. - -The semi-affine map composition specified in the memref type, maps from accesses -used to index the memref in load/store operations to other index spaces (i.e. -logical to physical index mapping). Each of the -[semi-affine maps](Dialects/Affine.md) and thus its composition is required to -be one-to-one. - -The semi-affine map composition can be used in dependence analysis, memory -access pattern analysis, and for performance optimizations like vectorization, -copy elision and in-place updates. If an affine map composition is not specified -for the memref, the identity affine map is assumed. 
- -##### Strided MemRef - -A memref may specify strides as part of its type. A stride specification is a -list of integer values that are either static or `?` (dynamic case). Strides -encode the distance, in number of elements, in (linear) memory between -successive entries along a particular dimension. A stride specification is -syntactic sugar for an equivalent strided memref representation using -semi-affine maps. For example, `memref<42x16xf32, offset: 33, strides: [1, 64]>` -specifies a non-contiguous memory region of `42` by `16` `f32` elements such -that: - -1. the minimal size of the enclosing memory region must be `33 + 42 * 1 + 16 * - 64 = 1066` elements; -2. the address calculation for accessing element `(i, j)` computes `33 + i + - 64 * j` -3. the distance between two consecutive elements along the inner dimension is - `1` element and the distance between two consecutive elements along the - outer dimension is `64` elements. - -This corresponds to a column major view of the memory region and is internally -represented as the type `memref<42x16xf32, (i, j) -> (33 + i + 64 * j)>`. - -The specification of strides must not alias: given an n-D strided memref, -indices `(i1, ..., in)` and `(j1, ..., jn)` may not refer to the same memory -address unless `i1 == j1, ..., in == jn`. - -Strided memrefs represent a view abstraction over preallocated data. They are -constructed with special ops, yet to be introduced. Strided memrefs are a -special subclass of memrefs with generic semi-affine map and correspond to a -normalized memref descriptor when lowering to LLVM. - -#### None Type - -Syntax: - -``` -none-type ::= `none` -``` - -The `none` type is a unit type, i.e. a type with exactly one possible value, -where its value does not have a defined dynamic representation. 
- -#### Tensor Type - -Syntax: - -``` -tensor-type ::= `tensor` `<` dimension-list type `>` - -dimension-list ::= dimension-list-ranked | (`*` `x`) -dimension-list-ranked ::= (dimension `x`)* -dimension ::= `?` | decimal-literal -``` - -Values with tensor type represents aggregate N-dimensional data values, and -have a known element type. It may have an unknown rank (indicated by `*`) or may -have a fixed rank with a list of dimensions. Each dimension may be a static -non-negative decimal constant or be dynamically determined (indicated by `?`). - -The runtime representation of the MLIR tensor type is intentionally abstracted - -you cannot control layout or get a pointer to the data. For low level buffer -access, MLIR has a [`memref` type](#memref-type). This abstracted runtime -representation holds both the tensor data values as well as information about -the (potentially dynamic) shape of the tensor. The -[`dim` operation](Dialects/Standard.md#dim-operation) returns the size of a -dimension from a value of tensor type. - -Note: hexadecimal integer literals are not allowed in tensor type declarations -to avoid confusion between `0xf32` and `0 x f32`. Zero sizes are allowed in -tensors and treated as other sizes, e.g., `tensor<0 x 1 x i32>` and `tensor<1 x -0 x i32>` are different types. Since zero sizes are not allowed in some other -types, such tensors should be optimized away before lowering tensors to vectors. - -Examples: - -```mlir -// Tensor with unknown rank. -tensor<* x f32> - -// Known rank but unknown dimensions. -tensor - -// Partially known dimensions. -tensor - -// Full static shape. -tensor<17 x 4 x 13 x 4 x f32> - -// Tensor with rank zero. Represents a scalar. -tensor - -// Zero-element dimensions are allowed. -tensor<0 x 42 x f32> - -// Zero-element tensor of f32 type (hexadecimal literals not allowed here). -tensor<0xf32> -``` - -#### Tuple Type - -Syntax: - -``` -tuple-type ::= `tuple` `<` (type ( `,` type)*)? 
`>` -``` - -The value of `tuple` type represents a fixed-size collection of elements, where -each element may be of a different type. - -**Rationale:** Though this type is first class in the type system, MLIR provides -no standard operations for operating on `tuple` types -([rationale](Rationale/Rationale.md#tuple-types)). - -Examples: - -```mlir -// Empty tuple. -tuple<> - -// Single element -tuple - -// Many elements. -tuple, i5> -``` - -#### Vector Type - -Syntax: - -``` -vector-type ::= `vector` `<` static-dimension-list vector-element-type `>` -vector-element-type ::= float-type | integer-type - -static-dimension-list ::= (decimal-literal `x`)+ -``` - -The vector type represents a SIMD style vector, used by target-specific -operation sets like AVX. While the most common use is for 1D vectors (e.g. -vector<16 x f32>) we also support multidimensional registers on targets that -support them (like TPUs). - -Vector shapes must be positive decimal integers. - -Note: hexadecimal integer literals are not allowed in vector type declarations, -`vector<0x42xi32>` is invalid because it is interpreted as a 2D vector with -shape `(0, 42)` and zero shapes are not allowed. +The [builtin dialect](Dialects/Builtin.md) defines a set of types that are +directly usable by any other dialect in MLIR. These types range from +primitive integer and floating-point types to function types and more. ## Attributes @@ -1401,263 +829,7 @@ ### Builtin Attribute Values -Builtin attributes are a core set of -[dialect attribute values](#dialect-attribute-values) that are defined in a -builtin dialect and thus available to all users of MLIR. 
- -``` -builtin-attribute ::= affine-map-attribute - | array-attribute - | bool-attribute - | dictionary-attribute - | elements-attribute - | float-attribute - | integer-attribute - | integer-set-attribute - | string-attribute - | symbol-ref-attribute - | type-attribute - | unit-attribute -``` - -#### AffineMap Attribute - -Syntax: - -``` -affine-map-attribute ::= `affine_map` `<` affine-map `>` -``` - -An affine-map attribute is an attribute that represents an affine-map object. - -#### Array Attribute - -Syntax: - -``` -array-attribute ::= `[` (attribute-value (`,` attribute-value)*)? `]` -``` - -An array attribute is an attribute that represents a collection of attribute -values. - -#### Boolean Attribute - -Syntax: - -``` -bool-attribute ::= bool-literal -``` - -A boolean attribute is a literal attribute that represents a one-bit boolean -value, true or false. - -#### Dictionary Attribute - -Syntax: - -``` -dictionary-attribute ::= `{` (attribute-entry (`,` attribute-entry)*)? `}` -``` - -A dictionary attribute is an attribute that represents a sorted collection of -named attribute values. The elements are sorted by name, and each name must be -unique within the collection. - -#### Elements Attributes - -Syntax: - -``` -elements-attribute ::= dense-elements-attribute - | opaque-elements-attribute - | sparse-elements-attribute -``` - -An elements attribute is a literal attribute that represents a constant -[vector](#vector-type) or [tensor](#tensor-type) value. - -##### Dense Elements Attribute - -Syntax: - -``` -dense-elements-attribute ::= `dense` `<` attribute-value `>` `:` - ( tensor-type | vector-type ) -``` - -A dense elements attribute is an elements attribute where the storage for the -constant vector or tensor value has been densely packed. The attribute supports -storing integer or floating point elements, with integer/index/floating element -types. It also support storing string elements with a custom dialect string -element type. 
- -##### Opaque Elements Attribute - -Syntax: - -``` -opaque-elements-attribute ::= `opaque` `<` dialect-namespace `,` - hex-string-literal `>` `:` - ( tensor-type | vector-type ) -``` - -An opaque elements attribute is an elements attribute where the content of the -value is opaque. The representation of the constant stored by this elements -attribute is only understood, and thus decodable, by the dialect that created -it. - -Note: The parsed string literal must be in hexadecimal form. - -##### Sparse Elements Attribute - -Syntax: - -``` -sparse-elements-attribute ::= `sparse` `<` attribute-value `,` attribute-value - `>` `:` ( tensor-type | vector-type ) -``` - -A sparse elements attribute is an elements attribute that represents a sparse -vector or tensor object. This is where very few of the elements are non-zero. - -The attribute uses COO (coordinate list) encoding to represent the sparse -elements of the elements attribute. The indices are stored via a 2-D tensor of -64-bit integer elements with shape [N, ndims], which specifies the indices of -the elements in the sparse tensor that contains non-zero values. The element -values are stored via a 1-D tensor with shape [N], that supplies the -corresponding values for the indices. - -Example: - -```mlir - sparse<[[0, 0], [1, 2]], [1, 5]> : tensor<3x4xi32> - -// This represents the following tensor: -/// [[1, 0, 0, 0], -/// [0, 0, 5, 0], -/// [0, 0, 0, 0]] -``` - -#### Float Attribute - -Syntax: - -``` -float-attribute ::= (float-literal (`:` float-type)?) - | (hexadecimal-literal `:` float-type) -``` - -A float attribute is a literal attribute that represents a floating point value -of the specified [float type](#floating-point-types). It can be represented in -the hexadecimal form where the hexadecimal value is interpreted as bits of the -underlying binary representation. This form is useful for representing infinity -and NaN floating point values. 
To avoid confusion with integer attributes, -hexadecimal literals _must_ be followed by a float type to define a float -attribute. - -Examples: - -``` -42.0 // float attribute defaults to f64 type -42.0 : f32 // float attribute of f32 type -0x7C00 : f16 // positive infinity -0x7CFF : f16 // NaN (one of possible values) -42 : f32 // Error: expected integer type -``` - -#### Integer Attribute - -Syntax: - -``` -integer-attribute ::= integer-literal ( `:` (index-type | integer-type) )? -``` - -An integer attribute is a literal attribute that represents an integral value of -the specified integer or index type. The default type for this attribute, if one -is not specified, is a 64-bit integer. - -##### Integer Set Attribute - -Syntax: - -``` -integer-set-attribute ::= `affine_set` `<` integer-set `>` -``` - -An integer-set attribute is an attribute that represents an integer-set object. - -#### String Attribute - -Syntax: - -``` -string-attribute ::= string-literal (`:` type)? -``` - -A string attribute is an attribute that represents a string literal value. - -#### Symbol Reference Attribute - -Syntax: - -``` -symbol-ref-attribute ::= symbol-ref-id (`::` symbol-ref-id)* -``` - -A symbol reference attribute is a literal attribute that represents a named -reference to an operation that is nested within an operation with the -`OpTrait::SymbolTable` trait. As such, this reference is given meaning by the -nearest parent operation containing the `OpTrait::SymbolTable` trait. It may -optionally contain a set of nested references that further resolve to a symbol -nested within a different symbol table. - -This attribute can only be held internally by -[array attributes](#array-attribute) and -[dictionary attributes](#dictionary-attribute)(including the top-level operation -attribute dictionary), i.e. no other attribute kinds such as Locations or -extended attribute kinds. 
- -**Rationale:** Identifying accesses to global data is critical to -enabling efficient multi-threaded compilation. Restricting global -data access to occur through symbols and limiting the places that can -legally hold a symbol reference simplifies reasoning about these data -accesses. - -See [`Symbols And SymbolTables`](SymbolsAndSymbolTables.md) for more -information. - -#### Type Attribute - -Syntax: - -``` -type-attribute ::= type -``` - -A type attribute is an attribute that represents a [type object](#type-system). - -#### Unit Attribute - -``` -unit-attribute ::= `unit` -``` - -A unit attribute is an attribute that represents a value of `unit` type. The -`unit` type allows only one value forming a singleton set. This attribute value -is used to represent attributes that only have meaning from their existence. - -One example of such an attribute could be the `swift.self` attribute. This -attribute indicates that a function parameter is the self/context parameter. It -could be represented as a [boolean attribute](#boolean-attribute)(true or -false), but a value of false doesn't really bring any value. The parameter -either is the self/context or it isn't. - -```mlir -// A unit attribute defined with the `unit` value specifier. -func @verbose_form(i1) attributes {dialectName.unitAttr = unit} - -// A unit attribute can also be defined without the value specifier. -func @simple_form(i1) attributes {dialectName.unitAttr} -``` +The [builtin dialect](Dialects/Builtin.md) defines a set of attribute values +that are directly usable by any other dialect in MLIR. These types cover a range +from primitive integer and floating-point values, attribute dictionaries, dense +multi-dimensional arrays, and more. diff --git a/mlir/include/mlir/IR/BuiltinTypes.td b/mlir/include/mlir/IR/BuiltinTypes.td --- a/mlir/include/mlir/IR/BuiltinTypes.td +++ b/mlir/include/mlir/IR/BuiltinTypes.td @@ -131,7 +131,6 @@ The function type can be thought of as a function signature. 
It consists of a list of formal parameter types and a list of formal result types. - ``` }]; let parameters = (ins "ArrayRef<Type>":$inputs, "ArrayRef<Type>":$results); let builders = [ diff --git a/mlir/include/mlir/IR/CMakeLists.txt b/mlir/include/mlir/IR/CMakeLists.txt --- a/mlir/include/mlir/IR/CMakeLists.txt +++ b/mlir/include/mlir/IR/CMakeLists.txt @@ -26,4 +26,7 @@ mlir_tablegen(BuiltinTypes.cpp.inc -gen-typedef-defs) add_public_tablegen_target(MLIRBuiltinTypesIncGen) -add_mlir_doc(BuiltinOps -gen-dialect-doc Builtin Dialects/) +add_mlir_doc(BuiltinAttributes -gen-attrdef-doc BuiltinAttributes Dialects/) +add_mlir_doc(BuiltinLocationAttributes -gen-attrdef-doc BuiltinLocationAttributes Dialects/) +add_mlir_doc(BuiltinOps -gen-op-doc BuiltinOps Dialects/) +add_mlir_doc(BuiltinTypes -gen-typedef-doc BuiltinTypes Dialects/) diff --git a/mlir/tools/mlir-tblgen/OpDocGen.cpp b/mlir/tools/mlir-tblgen/OpDocGen.cpp --- a/mlir/tools/mlir-tblgen/OpDocGen.cpp +++ b/mlir/tools/mlir-tblgen/OpDocGen.cpp @@ -162,46 +162,51 @@ // TypeDef Documentation //===----------------------------------------------------------------------===// -/// Emit the assembly format of a type. -static void emitTypeAssemblyFormat(TypeDef td, raw_ostream &os) { +static void emitAttrOrTypeDefAssemblyFormat(const AttrOrTypeDef &def, + raw_ostream &os) { SmallVector<AttrOrTypeParameter, 4> parameters; - td.getParameters(parameters); - if (parameters.size() == 0) { - os << "\nSyntax: `!" << td.getDialect().getName() << "." << td.getMnemonic() - << "`\n"; + def.getParameters(parameters); + if (parameters.empty()) { + os << "\nSyntax: `!" << def.getDialect().getName() << "." + << def.getMnemonic() << "`\n"; return; } - os << "\nSyntax:\n\n```\n!" << td.getDialect().getName() << "." - << td.getMnemonic() << "<\n"; - for (auto *it = parameters.begin(), *e = parameters.end(); it < e; ++it) { - os << " " << it->getSyntax(); - if (it < parameters.end() - 1) + os << "\nSyntax:\n\n```\n!" << def.getDialect().getName() << "."
+ << def.getMnemonic() << "<\n"; + for (auto it : llvm::enumerate(parameters)) { + const AttrOrTypeParameter &param = it.value(); + os << " " << param.getSyntax(); + if (it.index() < (parameters.size() - 1)) os << ","; - os << " # " << it->getName() << "\n"; + os << " # " << param.getName() << "\n"; } os << ">\n```\n"; } -static void emitTypeDefDoc(TypeDef td, raw_ostream &os) { - os << llvm::formatv("### `{0}` ({1})\n", td.getName(), td.getCppClassName()); +static void emitAttrOrTypeDefDoc(const AttrOrTypeDef &def, raw_ostream &os) { + os << llvm::formatv("### {0}\n", def.getCppClassName()); - // Emit the summary, syntax, and description if present. - if (td.hasSummary()) - os << "\n" << td.getSummary() << "\n"; - if (td.getMnemonic() && td.getPrinterCode() && *td.getPrinterCode() == "" && - td.getParserCode() && *td.getParserCode() == "") - emitTypeAssemblyFormat(td, os); - if (td.hasDescription()) { + // Emit the summary if present. + if (def.hasSummary()) + os << "\n" << def.getSummary() << "\n"; + + // Emit the syntax if present. + if (def.getMnemonic() && def.getPrinterCode() == StringRef() && + def.getParserCode() == StringRef()) + emitAttrOrTypeDefAssemblyFormat(def, os); + + // Emit the description if present. + if (def.hasDescription()) { os << "\n"; - mlir::tblgen::emitDescription(td.getDescription(), os); + mlir::tblgen::emitDescription(def.getDescription(), os); } - // Emit attribute documentation. + // Emit parameter documentation.
SmallVector<AttrOrTypeParameter, 4> parameters; - td.getParameters(parameters); + def.getParameters(parameters); if (!parameters.empty()) { - os << "\n#### Type parameters:\n\n"; + os << "\n#### Parameters:\n\n"; os << "| Parameter | C++ type | Description |\n" << "| :-------: | :-------: | ----------- |\n"; for (const auto &it : parameters) { @@ -214,24 +219,35 @@ os << "\n"; } +static void emitAttrOrTypeDefDoc(const RecordKeeper &recordKeeper, + raw_ostream &os, StringRef recordTypeName) { + std::vector<llvm::Record *> defs = + recordKeeper.getAllDerivedDefinitions(recordTypeName); + + os << "\n"; + for (const llvm::Record *def : defs) + emitAttrOrTypeDefDoc(AttrOrTypeDef(def), os); +} + //===----------------------------------------------------------------------===// // Dialect Documentation //===----------------------------------------------------------------------===// -static void emitDialectDoc(const Dialect &dialect, ArrayRef<Operator> ops, - ArrayRef<Type> types, ArrayRef<TypeDef> typeDefs, - raw_ostream &os) { - os << "# "; - if (dialect.getName().empty()) - os << "Builtin"; - else - os << "'" << dialect.getName() << "'"; - os << " Dialect\n\n"; +static void emitDialectDoc(const Dialect &dialect, ArrayRef<AttrDef> attrDefs, + ArrayRef<Operator> ops, ArrayRef<Type> types, + ArrayRef<TypeDef> typeDefs, raw_ostream &os) { + os << "# '" << dialect.getName() << "' Dialect\n\n"; emitIfNotEmpty(dialect.getSummary(), os); emitIfNotEmpty(dialect.getDescription(), os); os << "[TOC]\n\n"; + if (!attrDefs.empty()) { + os << "## Attribute definition\n\n"; + for (const AttrDef &def : attrDefs) + emitAttrOrTypeDefDoc(def, os); + } + // TODO: Add link between use and def for types if (!types.empty()) { os << "## Type constraint definition\n\n"; @@ -247,46 +263,68 @@ if (!typeDefs.empty()) { os << "## Type definition\n\n"; - for (const TypeDef &td : typeDefs) - emitTypeDefDoc(td, os); + for (const TypeDef &def : typeDefs) + emitAttrOrTypeDefDoc(def, os); } } static void emitDialectDoc(const RecordKeeper &recordKeeper, raw_ostream &os) { - const auto &opDefs = 
recordKeeper.getAllDerivedDefinitions("Op"); - const auto &typeDefs = recordKeeper.getAllDerivedDefinitions("DialectType"); - const auto &typeDefDefs = recordKeeper.getAllDerivedDefinitions("TypeDef"); + std::vector<llvm::Record *> opDefs = recordKeeper.getAllDerivedDefinitions("Op"); + std::vector<llvm::Record *> typeDefs = + recordKeeper.getAllDerivedDefinitions("DialectType"); + std::vector<llvm::Record *> typeDefDefs = + recordKeeper.getAllDerivedDefinitions("TypeDef"); + std::vector<llvm::Record *> attrDefDefs = + recordKeeper.getAllDerivedDefinitions("AttrDef"); std::set<Dialect> dialectsWithDocs; - std::map<Dialect, std::vector<Operator>> dialectOps; - std::map<Dialect, std::vector<Type>> dialectTypes; - std::map<Dialect, std::vector<TypeDef>> dialectTypeDefs; + + llvm::StringMap<std::vector<AttrDef>> dialectAttrDefs; + llvm::StringMap<std::vector<Operator>> dialectOps; + llvm::StringMap<std::vector<Type>> dialectTypes; + llvm::StringMap<std::vector<TypeDef>> dialectTypeDefs; + for (auto *attrDef : attrDefDefs) { + AttrDef attr(attrDef); + dialectAttrDefs[attr.getDialect().getName()].push_back(attr); + dialectsWithDocs.insert(attr.getDialect()); + } for (auto *opDef : opDefs) { Operator op(opDef); - dialectOps[op.getDialect()].push_back(op); + dialectOps[op.getDialect().getName()].push_back(op); dialectsWithDocs.insert(op.getDialect()); } for (auto *typeDef : typeDefs) { Type type(typeDef); if (auto dialect = type.getDialect()) - dialectTypes[dialect].push_back(type); + dialectTypes[dialect.getName()].push_back(type); } for (auto *typeDef : typeDefDefs) { TypeDef type(typeDef); - dialectTypeDefs[type.getDialect()].push_back(type); + dialectTypeDefs[type.getDialect().getName()].push_back(type); dialectsWithDocs.insert(type.getDialect()); } os << "\n"; - for (auto dialect : dialectsWithDocs) - emitDialectDoc(dialect, dialectOps[dialect], dialectTypes[dialect], - dialectTypeDefs[dialect], os); + for (const Dialect &dialect : dialectsWithDocs) { + StringRef dialectName = dialect.getName(); + emitDialectDoc(dialect, dialectAttrDefs[dialectName], + dialectOps[dialectName], dialectTypes[dialectName], + dialectTypeDefs[dialectName], os); + } } 
//===----------------------------------------------------------------------===// // Gen Registration //===----------------------------------------------------------------------===// +static mlir::GenRegistration + genAttrRegister("gen-attrdef-doc", + "Generate dialect attribute documentation", + [](const RecordKeeper &records, raw_ostream &os) { + emitAttrOrTypeDefDoc(records, os, "AttrDef"); + return false; + }); + static mlir::GenRegistration genOpRegister("gen-op-doc", "Generate dialect documentation", [](const RecordKeeper &records, raw_ostream &os) { @@ -294,6 +332,13 @@ return false; }); +static mlir::GenRegistration + genTypeRegister("gen-typedef-doc", "Generate dialect type documentation", + [](const RecordKeeper &records, raw_ostream &os) { + emitAttrOrTypeDefDoc(records, os, "TypeDef"); + return false; + }); + static mlir::GenRegistration genRegister("gen-dialect-doc", "Generate dialect documentation", [](const RecordKeeper &records, raw_ostream &os) {