diff --git a/mlir/docs/LangRef.md b/mlir/docs/LangRef.md
--- a/mlir/docs/LangRef.md
+++ b/mlir/docs/LangRef.md
@@ -1457,15 +1457,16 @@
 [array attributes](#array-attribute) and
 [dictionary attributes](#dictionary-attribute)(including the top-level operation
 attribute dictionary), i.e. no other attribute kinds such as Locations or
-extended attribute kinds. If a reference to a symbol is necessary from outside
-of the symbol table that the symbol is defined in, a
-[string attribute](#string-attribute) can be used to refer to the symbol name.
+extended attribute kinds.
 
 **Rationale:** Given that MLIR models global accesses with symbol references, to
 enable efficient multi-threading, it becomes difficult to effectively reason
 about their uses. By restricting the places that can legally hold a symbol
 reference, we can always opaquely reason about a symbols usage characteristics.
 
+See [`Symbols And SymbolTables`](SymbolsAndSymbolTables.md) for more
+information.
+
 #### Type Attribute
 
 Syntax:
diff --git a/mlir/docs/SymbolsAndSymbolTables.md b/mlir/docs/SymbolsAndSymbolTables.md
new file mode 100644
--- /dev/null
+++ b/mlir/docs/SymbolsAndSymbolTables.md
@@ -0,0 +1,165 @@
+# Symbols and Symbol Tables
+
+[TOC]
+
+MLIR is a multi-level representation; with [Regions](LangRef.md#regions), the
+multi-level aspect is structural in the IR. A lot of infrastructure within the
+compiler is built around this nesting structure, including the processing of
+operations within the [pass manager](WritingAPass.md#pass-manager). One nice
+aspect of MLIR is that it is able to process operations in parallel, utilizing
+multiple threads. This is possible due to a property of the IR known as
+`IsolatedFromAbove`. This property asserts that the regions of an operation will
+not capture, or reference, SSA values defined above the region scope.
+This means that the following is invalid if `foo.region_op` is defined as
+`IsolatedFromAbove`:
+
+```mlir
+%result = constant 10 : i32
+foo.region_op {
+  foo.yield %result : i32
+}
+```
+
+(TODO: Link to an explicit section detailing `IsolatedFromAbove` instead)
+
+Without this property, any operation could affect or mutate the use-list of
+operations defined above. Making this thread-safe requires expensive locking in
+some of the core IR data structures, which becomes quite inefficient. To avoid
+this, MLIR uses local pools for constant values as well as `Symbol` accesses for
+global values and variables. This document details the design of `Symbols`, what
+they are and how they fit into the system.
+
+[TOC]
+
+The `Symbol` infrastructure essentially provides a non-SSA mechanism in which to
+refer to an operation symbolically with a name. This allows for referring to
+operations defined above regions that were defined as `IsolatedFromAbove` in a
+safe way.
+
+## Symbols
+
+A `Symbol` is a named operation that resides within a region that defines a
+[`SymbolTable`](#symbol-table). The name of a symbol *must* be unique within the
+parent `SymbolTable`. This name is semantically similar to an SSA result
+value, and may be referred to by other operations to provide a symbolic link, or
+use, to the symbol. An example of a `Symbol` operation is
+[`func`](LangRef.md#functions). `func` defines a symbol name, which is
+[referred to](#referencing-a-symbol) by operations like
+[`std.call`](Dialects/Standard.md#call).
+
+### Defining a Symbol
+
+A `Symbol` operation may use the `OpTrait::Symbol` trait, and *must* adhere to
+the following:
+
+* Have a `StringAttr` attribute named
+  'SymbolTable::getSymbolAttrName()'(`sym_name`).
+  - This attribute defines the symbolic 'name' of the operation.
+* Have an optional `StringAttr` attribute named
+  'SymbolTable::getVisibilityAttrName()'(`sym_visibility`).
+  - This attribute defines the [visibility](#symbol-visibility) of the
+    symbol, or more specifically in which scopes it may be accessed.
+* Must not produce any SSA results.
+  - Intermixing the different ways to `use` an operation quickly becomes
+    unwieldy and difficult to analyze.
+
+### Referencing a Symbol
+
+`Symbols` are referenced symbolically by name via the
+[`SymbolRefAttr`](LangRef.md#symbol-reference-attribute) attribute. Using an
+attribute, as opposed to an SSA value, has the added benefit of allowing for
+references in more places than the operand list, including
+[nested attribute dictionaries](LangRef.md#dictionary-attribute),
+[array attributes](LangRef.md#array-attribute), etc. The general impact of this
+is that dialects may need to ensure that their operations support `SymbolRefs`
+and SSA values, or provide operations that materialize an SSA value from a
+symbol reference. Each has different trade-offs depending on the situation. A
+function call may directly use a `SymbolRef` as the callee, whereas a reference
+to a global variable might use a materialization operation so that the variable
+can be used in other operations like `std.addi`.
+[`llvm.mlir.addressof`](Dialects/LLVM.md#llvmmliraddressof) is one example of
+such an operation.
+
+See the `LangRef` definition of the
+[`SymbolRefAttr`](LangRef.md#symbol-reference-attribute) for more information
+about the structure of this attribute.
+
+### Manipulating a Symbol
+
+As described above, `Symbols` act as an auxiliary way of defining uses of
+operations to the traditional SSA use-list. As such, it is imperative to provide
+similar functionality to manipulate and inspect the list of uses and the users.
+The following are a few of the utilities provided by the `SymbolTable`:
+
+* `SymbolTable::getSymbolUses`
+
+  - Access an iterator range over all of the uses on and nested within a
+    particular operation.
+
+* `SymbolTable::symbolKnownUseEmpty`
+
+  - Check if a particular symbol is known to be unused within a specific
+    section of the IR.
+
+* `SymbolTable::replaceAllSymbolUses`
+
+  - Replace all of the uses of one symbol with a new one within a specific
+    section of the IR.
+
+* `SymbolTable::lookupNearestSymbolFrom`
+
+  - Look up the definition of a symbol in the nearest symbol table from some
+    anchor operation.
+
+## Symbol Table
+
+Described above are `Symbols`, which reside within a region of an operation
+defining a `SymbolTable`. A `SymbolTable` operation provides the container for
+the [`Symbol`](#symbols) operations. It verifies that all `Symbol` operations
+have a unique name, and provides facilities for looking up symbols by name.
+Operations defining a `SymbolTable` may use the `OpTrait::SymbolTable` trait.
+
+## Symbol Visibility
+
+Along with a name, a `Symbol` also has a `visibility` attached to it. The
+`visibility` of a symbol defines its structural reachability within the IR. A
+symbol may have one of the following visibilities:
+
+* Public
+
+  - The symbol may be referenced from outside of the visible IR. We cannot
+    assume that all of the uses of this symbol are observable.
+
+* Private
+
+  - The symbol may only be referenced from within the current symbol table.
+
+* Nested
+
+  - The symbol may be referenced by operations outside of the current symbol
+    table, but not outside of the visible IR, as long as each symbol table
+    parent also defines a non-private symbol.
+
+A few examples of what this looks like in the IR are shown below:
+
+```mlir
+module @public_module {
+  // This function can be accessed by 'live.user'
+  func @nested_function() attributes { sym_visibility = "nested" }
+
+  // This function cannot be accessed outside of 'public_module'
+  func @private_function() attributes { sym_visibility = "private" }
+}
+
+// This function can only be accessed from within the top-level module
+func @private_function() attributes { sym_visibility = "private" }
+
+// This function may be referenced externally
+func @public_function()
+
+"live.user"() {uses = [
+  @public_module::@nested_function,
+  @private_function,
+  @public_function
+]} : () -> ()
+```