diff --git a/mlir/docs/ConversionToLLVMDialect.md b/mlir/docs/ConversionToLLVMDialect.md
deleted file mode 100644
--- a/mlir/docs/ConversionToLLVMDialect.md
+++ /dev/null
@@ -1,284 +0,0 @@
-# Conversion to the LLVM Dialect
-
-Conversion from several dialects that rely on
-[built-in types](LangRef.md/#builtin-types) to the
-[LLVM Dialect](Dialects/LLVM.md) is expected to be performed through the
-[Dialect Conversion](DialectConversion.md) infrastructure.
-
-The conversion of types and that of the overall module structure is described in
-this document. Individual conversion passes provide a set of conversion patterns
-for ops in different dialects, such as `-convert-std-to-llvm` for ops in the
-[Standard dialect](Dialects/Standard.md) and `-convert-vector-to-llvm` for ops in
-the [Vector dialect](Dialects/Vector.md). *Note that some conversions subsume the
-others.*
-
-We use the terminology defined by the
-[LLVM Dialect description](Dialects/LLVM.md) throughout this document.
-
-[TOC]
-
-## Type Conversion
-
-### Scalar Types
-
-Scalar types are converted to their LLVM counterparts if they exist. The
-following conversions are currently implemented:
-
-   `i*` converts to `i*`
--   `bf16` converts to `bf16`
--   `f16` converts to `f16`
--   `f32` converts to `f32`
--   `f64` converts to `f64`
--   `f80` converts to `f80`
--   `f128` converts to `f128`
-
-### Index Type
-
-Index type is converted to an LLVM dialect integer type with bitwidth equal to
-the bitwidth of the pointer size as specified by the
-[data layout](Dialects/LLVM.md/#data-layout-and-triple) of the closest module.
-For example, on x86-64 CPUs it converts to `i64`. This behavior can be
-overridden by the type converter configuration, which is often exposed as a pass
-option by conversion passes.
-
-### Vector Types
-
-LLVM IR only supports *one-dimensional* vectors, unlike MLIR where vectors can
-be multi-dimensional. Vector types cannot be nested in either IR. In the
-one-dimensional case, MLIR vectors are converted to LLVM IR vectors of the same
-size with element type converted using these conversion rules. In the
-n-dimensional case, MLIR vectors are converted to (n-1)-dimensional array types
-of one-dimensional vectors.
-
-For example, `vector<4xf32>` converts to `vector<4xf32>` and `vector<4 x 8 x 16
-x f32>` converts to `!llvm.array<4 x array<8 x vec<16 x f32>>>`.
-
-### Ranked Memref Types
-
-Memref types in MLIR have both static and dynamic information associated with
-them. In the general case, the dynamic information describes dynamic sizes in
-the logical indexing space and any symbols bound to the memref. This dynamic
-information must be present at runtime in the LLVM dialect equivalent type.
-
-In practice, the conversion supports two conventions:
-
--   the default convention for memrefs in the
-    **[strided form](Dialects/Builtin.md/#strided-memref)**;
--   a "bare pointer" conversion for statically-shaped memrefs with default
-    layout.
-
-The choice between conventions is specified at type converter construction time
-and is often exposed as an option by conversion passes.
-
-Memrefs with arbitrary layouts are not supported. Instead, these layouts can be
-factored out of the type and used as part of index computation for operations
-that read and write into a memref with the default layout.
-
-#### Default Convention
-
-The dynamic information comprises the buffer pointer as well as sizes and
-strides of any dynamically-sized dimensions. Memref types are normalized and
-converted to a _descriptor_ that is only dependent on the rank of the memref.
-The descriptor contains the following fields in order:
-
-1.  The pointer to the data buffer as allocated, referred to as "allocated
-    pointer". This is only useful for deallocating the memref.
-2.  The pointer to the properly aligned data pointer that the memref indexes,
-    referred to as "aligned pointer".
-3.  A converted `index`-type integer containing the distance in number
-    of elements between the beginning of the (aligned) buffer and the first
-    element to be accessed through the memref, referred to as "offset".
-4.  An array containing as many converted `index`-type integers as the rank of
-    the memref: the array represents the size, in number of elements, of the
-    memref along the given dimension. For constant memref dimensions, the
-    corresponding size entry is a constant whose runtime value must match the
-    static value.
-5.  A second array containing as many converted `index`-type integers as the
-    rank of memref: the second array represents the "stride" (in tensor
-    abstraction sense), i.e. the number of consecutive elements of the
-    underlying buffer one needs to jump over to get to the next logically
-    indexed element.
-
-This normalization serves as an ABI for the memref type to interoperate with
-externally linked functions.
-
-In the particular case of rank `0` memrefs, the size and stride arrays are
-omitted, resulting in a struct containing two pointers and an offset.
-
-Examples:
-
-```mlir
-memref<f32> -> !llvm.struct<(ptr<f32>, ptr<f32>, i64)>
-memref<1 x f32> -> !llvm.struct<(ptr<f32>, ptr<f32>, i64,
-                                 array<1 x i64>, array<1 x i64>)>
-memref<? x f32> -> !llvm.struct<(ptr<f32>, ptr<f32>, i64,
-                                 array<1 x i64>, array<1 x i64>)>
-memref<10x42x42x43x123 x f32> -> !llvm.struct<(ptr<f32>, ptr<f32>, i64,
-                                               array<5 x i64>, array<5 x i64>)>
-memref<10x?x42x?x123 x f32> -> !llvm.struct<(ptr<f32>, ptr<f32>, i64,
-                                             array<5 x i64>, array<5 x i64>)>
-
-// Memref types can have vectors as element types
-memref<1x? x vector<4xf32>> -> !llvm.struct<(ptr<vec<4 x f32>>,
-                                             ptr<vec<4 x f32>>, i64,
-                                             array<1 x i64>, array<1 x i64>)>
-```
-
-#### Bare Pointer Convention
-
-Ranked memrefs with static shape and default layout can be converted into an
-LLVM dialect pointer to their element type. Only the default alignment is
-supported in such cases, e.g. the `alloc` operation cannot have an alignment
-attribute.
-
-Examples:
-
-```mlir
-memref<f32> -> !llvm.ptr<f32>
-memref<10x42 x f32> -> !llvm.ptr<f32>
-
-// Memrefs with vector types are also supported.
-memref<10x42 x vector<4xf32>> -> !llvm.ptr<vec<4 x f32>>
-```
-
-### Unranked Memref Types
-
-Unranked memrefs are converted to an unranked descriptor that contains:
-
-1.  a converted `index`-typed integer representing the dynamic rank of the
-    memref;
-2.  a type-erased pointer (`!llvm.ptr<i8>`) to a ranked memref descriptor with
-    the contents listed above.
-
-This descriptor is primarily intended for interfacing with rank-polymorphic
-library functions. The pointer to the ranked memref descriptor points to memory
-_allocated on stack_ of the function in which it is used.
-
-Note that stack allocations may be emitted at a location where the unranked
-memref first appears, e.g., a cast operation, and remain live throughout the
-lifetime of the function; this may lead to stack exhaustion if used in a loop.
-
-Examples:
-
-```mlir
-// Unranked descriptor.
-memref<*xf32> -> !llvm.struct<(i64, ptr<i8>)>
-```
-
-Bare pointer convention does not support unranked memrefs.
-
-### Function Types
-
-Function types get converted to LLVM dialect function types. The arguments are
-converted individually according to these rules, except for `memref` types in
-function arguments and higher-order functions, which are described below. The
-result types need to accommodate the fact that LLVM functions always have a
-return type, which may be an `!llvm.void` type. The converted function always
-has a single result type. If the original function type had no results, the
-converted function will have one result of the `!llvm.void` type. If the
-original function type had one result, the converted function will also have one
-result converted using these rules. Otherwise, the result type will be an LLVM
-dialect structure type where each element of the structure corresponds to one of
-the results of the original function, converted using these rules.
-
-Examples:
-
-```mlir
-// Zero-ary function type with no results:
-() -> ()
-// is converted to a zero-ary function with `void` result.
-!llvm.func<void ()>
-
-// Unary function with one result:
-(i32) -> (i64)
-// has its argument and result type converted, before creating the LLVM dialect
-// function type.
-!llvm.func<i64 (i32)>
-
-// Binary function with one result:
-(i32, f32) -> (i64)
-// has its arguments handled separately
-!llvm.func<i64 (i32, f32)>
-
-// Binary function with two results:
-(i32, f32) -> (i64, f64)
-// has its result aggregated into a structure type.
-!llvm.func<struct<(i64, f64)> (i32, f32)>
-```
-
-#### Functions as Function Arguments or Results
-
-Higher-order function types, i.e. types of functions that have other functions
-as arguments or results, are converted differently to accommodate the fact that
-LLVM IR does not allow for function-typed values. Instead, functions are
-expected to be passed into and returned from other functions _by pointer_.
-Therefore, function-typed function arguments and results are converted to
-pointer-to-the-function type. The pointee type is converted using these rules.
-
-Examples:
-
-```mlir
-// Function-typed arguments or results in higher-order functions:
-(() -> ()) -> (() -> ())
-// are converted into pointers to functions.
-!llvm.func<ptr<func<void ()>> (ptr<func<void ()>>)>
-
-// These rules apply recursively: a function type taking a function that takes
-// another function
-( ( (i32) -> (i64) ) -> () ) -> ()
-// is converted into a function type taking a pointer-to-function that takes
-// another pointer-to-function.
-!llvm.func<void (ptr<func<void (ptr<func<i64 (i32)>>)>>)>
-```
-
-#### Memrefs as Function Arguments
-
-When used as function arguments, both ranked and unranked memrefs are converted
-into a list of arguments that represents each _scalar_ component of their
-descriptor. This is intended for some compatibility with C ABI, in which
-structure types would need to be passed by-pointer leading to the need for
-allocations and related issues, as well as for aliasing annotations, which are
-currently attached to pointers in function arguments. Having scalar components
-means that each size and stride is passed as an individual value.
-
-When used as function results, memrefs are converted as usual, i.e. each memref
-is converted to a descriptor struct (default convention) or to a pointer (bare
-pointer convention).
-
-Examples:
-
-```mlir
-// A memref descriptor appearing as function argument:
-(memref<f32>) -> ()
-// gets converted into a list of individual scalar components of a descriptor.
-!llvm.func<void (ptr<f32>, ptr<f32>, i64)>
-
-// The list of arguments is linearized and one can freely mix memref and other
-// types in this list:
-(memref<f32>, f32) -> ()
-// which gets converted into a flat list.
-!llvm.func<void (ptr<f32>, ptr<f32>, i64, f32)>
-
-// For nD ranked memref descriptors:
-(memref<?x?xf32>) -> ()
-// the converted signature will contain 2n+1 `index`-typed integer arguments,
-// offset, n sizes and n strides, per memref argument type.
-!llvm.func<void (ptr<f32>, ptr<f32>, i64, i64, i64, i64, i64)>
-
-// Same rules apply to unranked descriptors:
-(memref<*xf32>) -> ()
-// which get converted into their components.
-!llvm.func<void (i64, ptr<i8>)>
-
-// However, returning a memref from a function is not affected:
-() -> (memref<?xf32>)
-// gets converted to a function returning a descriptor structure.
-!llvm.func<struct<(ptr<f32>, ptr<f32>, i64, array<1xi64>, array<1xi64>)> ()>
-
-// If multiple memref-typed results are returned:
-() -> (memref<f32>, memref<f64>)
-// their descriptor structures are additionally packed into another structure,
-// potentially with other non-memref typed results.
-!llvm.func<struct<(struct<(ptr<f32>, ptr<f32>, i64)>,
-                   struct<(ptr<f64>, ptr<f64>, i64)>)> ()>
-```
diff --git a/mlir/docs/LLVMDialectMemRefConvention.md b/mlir/docs/LLVMDialectMemRefConvention.md
deleted file mode 100644
--- a/mlir/docs/LLVMDialectMemRefConvention.md
+++ /dev/null
@@ -1,494 +0,0 @@
-# Built-in Function and MemRef Calling Convention
-
-This document describes the calling convention implemented in the conversion of
-built-in [function operation](Dialects/Builtin.md/#func-mlirfuncop), standard
-[`call`](Dialects/Standard.md/#stdcall-callop) operations and the handling of
-[`memref`](Dialects/Builtin.md#memreftype) type equivalents in the
-[LLVM dialect](Dialects/LLVM.md). The conversion assumes the _default_
-convention was used when converting
-[built-in to the LLVM dialect types](ConversionToLLVMDialect.md).
-
-## Function Result Packing
-
-In case of multi-result functions, the returned values are inserted into a
-structure-typed value before being returned and extracted from it at the call
-site. This transformation is a part of the conversion and is transparent to the
-definitions and uses of the values being returned.
-
-Example:
-
-```mlir
-func @foo(%arg0: i32, %arg1: i64) -> (i32, i64) {
-  return %arg0, %arg1 : i32, i64
-}
-func @bar() {
-  %0 = constant 42 : i32
-  %1 = constant 17 : i64
-  %2:2 = call @foo(%0, %1) : (i32, i64) -> (i32, i64)
-  "use_i32"(%2#0) : (i32) -> ()
-  "use_i64"(%2#1) : (i64) -> ()
-}
-
-// is transformed into
-
-llvm.func @foo(%arg0: i32, %arg1: i64) -> !llvm.struct<(i32, i64)> {
-  // insert the values into a structure
-  %0 = llvm.mlir.undef : !llvm.struct<(i32, i64)>
-  %1 = llvm.insertvalue %arg0, %0[0] : !llvm.struct<(i32, i64)>
-  %2 = llvm.insertvalue %arg1, %1[1] : !llvm.struct<(i32, i64)>
-
-  // return the structure value
-  llvm.return %2 : !llvm.struct<(i32, i64)>
-}
-llvm.func @bar() {
-  %0 = llvm.mlir.constant(42 : i32) : i32
-  %1 = llvm.mlir.constant(17 : i64) : i64
-
-  // call and extract the values from the structure
-  %2 = llvm.call @foo(%0, %1)
-     : (i32, i64) -> !llvm.struct<(i32, i64)>
-  %3 = llvm.extractvalue %2[0] : !llvm.struct<(i32, i64)>
-  %4 = llvm.extractvalue %2[1] : !llvm.struct<(i32, i64)>
-
-  // use as before
-  "use_i32"(%3) : (i32) -> ()
-  "use_i64"(%4) : (i64) -> ()
-}
-```
-
-## Calling Convention for Ranked `memref`
-
-Function _arguments_ of `memref` type, ranked or unranked, are _expanded_ into a
-list of arguments of non-aggregate types that the memref descriptor defined
-above comprises. That is, the outer struct type and the inner array types are
-replaced with individual arguments.
-
-This convention is implemented in the conversion of `std.func` and `std.call` to
-the LLVM dialect, with the former packing the individual argument values back
-into a descriptor so as to make it transparently usable by other operations and
-the latter unpacking the descriptor into a set of individual values at the call
-site. Conversions from other dialects should take this convention into account.
-
-This specific convention is motivated by the necessity to specify alignment and
-aliasing attributes on the raw pointers underpinning the memref.
-
-Examples:
-
-```mlir
-func @foo(%arg0: memref<?xf32>) -> () {
-  "use"(%arg0) : (memref<?xf32>) -> ()
-  return
-}
-
-// Gets converted to the following
-// (using type alias for brevity):
-!llvm.memref_1d = type !llvm.struct<(ptr<f32>, ptr<f32>, i64,
-                                     array<1xi64>, array<1xi64>)>
-
-llvm.func @foo(%arg0: !llvm.ptr<f32>,  // Allocated pointer.
-               %arg1: !llvm.ptr<f32>,  // Aligned pointer.
-               %arg2: i64,         // Offset.
-               %arg3: i64,         // Size in dim 0.
-               %arg4: i64) {       // Stride in dim 0.
-  // Populate memref descriptor structure.
-  %0 = llvm.mlir.undef : !llvm.memref_1d
-  %1 = llvm.insertvalue %arg0, %0[0] : !llvm.memref_1d
-  %2 = llvm.insertvalue %arg1, %1[1] : !llvm.memref_1d
-  %3 = llvm.insertvalue %arg2, %2[2] : !llvm.memref_1d
-  %4 = llvm.insertvalue %arg3, %3[3, 0] : !llvm.memref_1d
-  %5 = llvm.insertvalue %arg4, %4[4, 0] : !llvm.memref_1d
-
-  // Descriptor is now usable as a single value.
-  "use"(%5) : (!llvm.memref_1d) -> ()
-  llvm.return
-}
-```
-
-```mlir
-func @bar() {
-  %0 = "get"() : () -> (memref<?xf32>)
-  call @foo(%0) : (memref<?xf32>) -> ()
-  return
-}
-
-// Gets converted to the following
-// (using type alias for brevity):
-!llvm.memref_1d = type !llvm.struct<(ptr<f32>, ptr<f32>, i64,
-                                     array<1xi64>, array<1xi64>)>
-
-llvm.func @bar() {
-  %0 = "get"() : () -> !llvm.memref_1d
-
-  // Unpack the memref descriptor.
-  %1 = llvm.extractvalue %0[0] : !llvm.memref_1d
-  %2 = llvm.extractvalue %0[1] : !llvm.memref_1d
-  %3 = llvm.extractvalue %0[2] : !llvm.memref_1d
-  %4 = llvm.extractvalue %0[3, 0] : !llvm.memref_1d
-  %5 = llvm.extractvalue %0[4, 0] : !llvm.memref_1d
-
-  // Pass individual values to the callee.
-  llvm.call @foo(%1, %2, %3, %4, %5)
-     : (!llvm.ptr<f32>, !llvm.ptr<f32>, i64, i64, i64) -> ()
-  llvm.return
-}
-```
-
-## Calling Convention for Unranked `memref`
-
-For unranked memrefs, the list of function arguments always contains two
-elements, same as the unranked memref descriptor: an integer rank, and a
-type-erased (`!llvm.ptr<i8>`) pointer to the ranked memref descriptor. Note that
-while the _calling convention_ does not require stack allocation, _casting_ to
-unranked memref does since one cannot take an address of an SSA value containing
-the ranked memref. The caller is in charge of ensuring the thread safety and
-eventually removing unnecessary stack allocations in cast operations.
-
-Examples:
-
-```mlir
-func @foo(%arg0: memref<*xf32>) -> () {
-  "use"(%arg0) : (memref<*xf32>) -> ()
-  return
-}
-
-// Gets converted to the following.
-
-llvm.func @foo(%arg0: i64,       // Rank.
-               %arg1: !llvm.ptr<i8>) { // Type-erased pointer to descriptor.
-  // Pack the unranked memref descriptor.
-  %0 = llvm.mlir.undef : !llvm.struct<(i64, ptr<i8>)>
-  %1 = llvm.insertvalue %arg0, %0[0] : !llvm.struct<(i64, ptr<i8>)>
-  %2 = llvm.insertvalue %arg1, %1[1] : !llvm.struct<(i64, ptr<i8>)>
-
-  "use"(%2) : (!llvm.struct<(i64, ptr<i8>)>) -> ()
-  llvm.return
-}
-```
-
-```mlir
-func @bar() {
-  %0 = "get"() : () -> (memref<*xf32>)
-  call @foo(%0): (memref<*xf32>) -> ()
-  return
-}
-
-// Gets converted to the following.
-
-llvm.func @bar() {
-  %0 = "get"() : () -> (!llvm.struct<(i64, ptr<i8>)>)
-
-  // Unpack the memref descriptor.
-  %1 = llvm.extractvalue %0[0] : !llvm.struct<(i64, ptr<i8>)>
-  %2 = llvm.extractvalue %0[1] : !llvm.struct<(i64, ptr<i8>)>
-
-  // Pass individual values to the callee.
-  llvm.call @foo(%1, %2) : (i64, !llvm.ptr<i8>) -> ()
-  llvm.return
-}
-```
-
-**Lifetime.** The second element of the unranked memref descriptor points to
-some memory in which the ranked memref descriptor is stored. By convention, this
-memory is allocated on stack and has the lifetime of the function. (*Note:* due
-to function-length lifetime, creation of multiple unranked memref descriptors,
-e.g., in a loop, may lead to stack overflows.) If an unranked descriptor has to
-be returned from a function, the ranked descriptor it points to is copied into
-dynamically allocated memory, and the pointer in the unranked descriptor is
-updated accordingly. The allocation happens immediately before returning. It is
-the responsibility of the caller to free the dynamically allocated memory. The
-default conversion of `std.call` and `std.call_indirect` copies the ranked
-descriptor to newly allocated memory on the caller's stack. Thus, the convention
-of the ranked memref descriptor pointed to by an unranked memref descriptor
-being stored on stack is respected.
-
-*This convention may or may not apply if the conversion of MemRef types is
-overridden by the user.*
-
-## C-compatible wrapper emission
-
-In practical cases, it may be desirable to have externally-facing functions with
-a single argument corresponding to each MemRef argument. When interfacing with
-LLVM IR produced from C, the code needs to respect the corresponding calling
-convention. The conversion to the LLVM dialect provides an option to generate
-wrapper functions that take memref descriptors as pointers-to-struct compatible
-with data types produced by Clang when compiling C sources. The generation of
-such wrapper functions can additionally be controlled at a function granularity
-by setting the `llvm.emit_c_interface` unit attribute.
-
-More specifically, a memref argument is converted into a pointer-to-struct
-argument of type `{T*, T*, i64, i64[N], i64[N]}*` in the wrapper function, where
-`T` is the converted element type and `N` is the memref rank. This type is
-compatible with that produced by Clang for the following C++ structure template
-instantiations or their equivalents in C.
-
-```cpp
-template<typename T, size_t N>
-struct MemRefDescriptor {
-  T *allocated;
-  T *aligned;
-  intptr_t offset;
-  intptr_t sizes[N];
-  intptr_t strides[N];
-};
-```
-
-Furthermore, we also rewrite function results to pointer parameters if the
-rewritten function result has a struct type. The special result parameter is
-added as the first parameter and is of pointer-to-struct type.
-
-If enabled, the option will do the following for _external_ functions declared
-in the MLIR module:
-
-1.  Declare a new function `_mlir_ciface_<original name>` where memref arguments
-    are converted to pointer-to-struct and the remaining arguments are converted
-    as usual. Results are converted to a special argument if they are of struct
-    type.
-2.  Add a body to the original function (making it non-external) that
-    1.  allocates memref descriptors,
-    2.  populates them,
-    3.  potentially allocates space for the result struct, and
-    4.  passes the pointers to these into the newly declared interface function,
-        then
-    5.  collects the result of the call (potentially from the result struct),
-        and
-    6.  returns it to the caller.
-
-For (non-external) functions defined in the MLIR module, the option will:
-
-1.  Define a new function `_mlir_ciface_<original name>` where memref arguments
-    are converted to pointer-to-struct and the remaining arguments are converted
-    as usual. Results are converted to a special argument if they are of struct
-    type.
-2.  Populate the body of the newly defined function with IR that
-    1.  loads descriptors from pointers;
-    2.  unpacks descriptor into individual non-aggregate values;
-    3.  passes these values into the original function;
-    4.  collects the results of the call and
-    5.  either copies the results into the result struct or returns them to the
-        caller.
-
-Examples:
-
-```mlir
-
-func @qux(%arg0: memref<?x?xf32>)
-
-// Gets converted into the following
-// (using type alias for brevity):
-!llvm.memref_2d = type !llvm.struct<(ptr<f32>, ptr<f32>, i64,
-                                     array<2xi64>, array<2xi64>)>
-
-// Function with unpacked arguments.
-llvm.func @qux(%arg0: !llvm.ptr<f32>, %arg1: !llvm.ptr<f32>,
-               %arg2: i64, %arg3: i64, %arg4: i64,
-               %arg5: i64, %arg6: i64) {
-  // Populate memref descriptor (as per calling convention).
-  %0 = llvm.mlir.undef : !llvm.memref_2d
-  %1 = llvm.insertvalue %arg0, %0[0] : !llvm.memref_2d
-  %2 = llvm.insertvalue %arg1, %1[1] : !llvm.memref_2d
-  %3 = llvm.insertvalue %arg2, %2[2] : !llvm.memref_2d
-  %4 = llvm.insertvalue %arg3, %3[3, 0] : !llvm.memref_2d
-  %5 = llvm.insertvalue %arg5, %4[4, 0] : !llvm.memref_2d
-  %6 = llvm.insertvalue %arg4, %5[3, 1] : !llvm.memref_2d
-  %7 = llvm.insertvalue %arg6, %6[4, 1] : !llvm.memref_2d
-
-  // Store the descriptor in a stack-allocated space.
-  %8 = llvm.mlir.constant(1 : index) : i64
-  %9 = llvm.alloca %8 x !llvm.memref_2d
-     : (i64) -> !llvm.ptr<struct<(ptr<f32>, ptr<f32>, i64,
-                                        array<2xi64>, array<2xi64>)>>
-  llvm.store %7, %9 : !llvm.ptr<struct<(ptr<f32>, ptr<f32>, i64,
-                                        array<2xi64>, array<2xi64>)>>
-
-  // Call the interface function.
-  llvm.call @_mlir_ciface_qux(%9)
-     : (!llvm.ptr<struct<(ptr<f32>, ptr<f32>, i64,
-                          array<2xi64>, array<2xi64>)>>) -> ()
-
-  // The stored descriptor will be freed on return.
-  llvm.return
-}
-
-// Interface function.
-llvm.func @_mlir_ciface_qux(!llvm.ptr<struct<(ptr<f32>, ptr<f32>, i64,
-                                              array<2xi64>, array<2xi64>)>>)
-```
-
-```mlir
-func @foo(%arg0: memref<?x?xf32>) {
-  return
-}
-
-// Gets converted into the following
-// (using type alias for brevity):
-!llvm.memref_2d = type !llvm.struct<(ptr<f32>, ptr<f32>, i64,
-                                     array<2xi64>, array<2xi64>)>
-!llvm.memref_2d_ptr = type !llvm.ptr<struct<(ptr<f32>, ptr<f32>, i64,
-                                             array<2xi64>, array<2xi64>)>>
-
-// Function with unpacked arguments.
-llvm.func @foo(%arg0: !llvm.ptr<f32>, %arg1: !llvm.ptr<f32>,
-               %arg2: i64, %arg3: i64, %arg4: i64,
-               %arg5: i64, %arg6: i64) {
-  llvm.return
-}
-
-// Interface function callable from C.
-llvm.func @_mlir_ciface_foo(%arg0: !llvm.memref_2d_ptr) {
-  // Load the descriptor.
-  %0 = llvm.load %arg0 : !llvm.memref_2d_ptr
-
-  // Unpack the descriptor as per calling convention.
-  %1 = llvm.extractvalue %0[0] : !llvm.memref_2d
-  %2 = llvm.extractvalue %0[1] : !llvm.memref_2d
-  %3 = llvm.extractvalue %0[2] : !llvm.memref_2d
-  %4 = llvm.extractvalue %0[3, 0] : !llvm.memref_2d
-  %5 = llvm.extractvalue %0[3, 1] : !llvm.memref_2d
-  %6 = llvm.extractvalue %0[4, 0] : !llvm.memref_2d
-  %7 = llvm.extractvalue %0[4, 1] : !llvm.memref_2d
-  llvm.call @foo(%1, %2, %3, %4, %5, %6, %7)
-    : (!llvm.ptr<f32>, !llvm.ptr<f32>, i64, i64, i64,
-       i64, i64) -> ()
-  llvm.return
-}
-```
-
-```mlir
-func @foo(%arg0: memref<?x?xf32>) -> memref<?x?xf32> {
-  return %arg0 : memref<?x?xf32>
-}
-
-// Gets converted into the following
-// (using type alias for brevity):
-!llvm.memref_2d = type !llvm.struct<(ptr<f32>, ptr<f32>, i64,
-                                     array<2xi64>, array<2xi64>)>
-!llvm.memref_2d_ptr = type !llvm.ptr<struct<(ptr<f32>, ptr<f32>, i64,
-                                             array<2xi64>, array<2xi64>)>>
-
-// Function with unpacked arguments.
-llvm.func @foo(%arg0: !llvm.ptr<f32>, %arg1: !llvm.ptr<f32>, %arg2: i64,
-               %arg3: i64, %arg4: i64, %arg5: i64, %arg6: i64)
-    -> !llvm.memref_2d {
-  %0 = llvm.mlir.undef : !llvm.memref_2d
-  %1 = llvm.insertvalue %arg0, %0[0] : !llvm.memref_2d
-  %2 = llvm.insertvalue %arg1, %1[1] : !llvm.memref_2d
-  %3 = llvm.insertvalue %arg2, %2[2] : !llvm.memref_2d
-  %4 = llvm.insertvalue %arg3, %3[3, 0] : !llvm.memref_2d
-  %5 = llvm.insertvalue %arg5, %4[4, 0] : !llvm.memref_2d
-  %6 = llvm.insertvalue %arg4, %5[3, 1] : !llvm.memref_2d
-  %7 = llvm.insertvalue %arg6, %6[4, 1] : !llvm.memref_2d
-  llvm.return %7 : !llvm.memref_2d
-}
-
-// Interface function callable from C.
-llvm.func @_mlir_ciface_foo(%arg0: !llvm.memref_2d_ptr, %arg1: !llvm.memref_2d_ptr) {
-  %0 = llvm.load %arg1 : !llvm.memref_2d_ptr
-  %1 = llvm.extractvalue %0[0] : !llvm.memref_2d
-  %2 = llvm.extractvalue %0[1] : !llvm.memref_2d
-  %3 = llvm.extractvalue %0[2] : !llvm.memref_2d
-  %4 = llvm.extractvalue %0[3, 0] : !llvm.memref_2d
-  %5 = llvm.extractvalue %0[3, 1] : !llvm.memref_2d
-  %6 = llvm.extractvalue %0[4, 0] : !llvm.memref_2d
-  %7 = llvm.extractvalue %0[4, 1] : !llvm.memref_2d
-  %8 = llvm.call @foo(%1, %2, %3, %4, %5, %6, %7)
-    : (!llvm.ptr<f32>, !llvm.ptr<f32>, i64, i64, i64, i64, i64) -> !llvm.memref_2d
-  llvm.store %8, %arg0 : !llvm.memref_2d_ptr
-  llvm.return
-}
-```
-
-Rationale: Introducing auxiliary functions for C-compatible interfaces is
-preferred to modifying the calling convention since it will minimize the effect
-of C compatibility on intra-module calls or calls between MLIR-generated
-functions. In particular, when calling external functions from an MLIR module in
-a (parallel) loop, the fact of storing a memref descriptor on stack can lead to
-stack exhaustion and/or concurrent access to the same address. The auxiliary
-interface function serves as an allocation scope in this case. Furthermore, when
-targeting accelerators with separate memory spaces such as GPUs, stack-allocated
-descriptors passed by pointer would have to be transferred to the device memory,
-which introduces significant overhead. In such situations, auxiliary interface
-functions are executed on host and only pass the values through device function
-invocation mechanism.
-
-## Default Memref Model
-
-### Memref Descriptor
-
-Within a converted function, a `memref`-typed value is represented by a memref
-_descriptor_, the type of which is the structure type obtained by converting
-from the memref type. This descriptor holds all the necessary information to
-produce an address of a specific element. In particular, it holds dynamic values
-for static sizes, and they are expected to match at all times.
-
-It is created by the allocation operation and is updated by the conversion
-operations that may change static dimensions into dynamic dimensions and vice
-versa.
-
-**Note**: LLVM IR conversion does not support `memref`s with layouts that are
-not amenable to the strided form.
-
-### Index Linearization
-
-Accesses to a memref element are transformed into an access to an element of the
-buffer pointed to by the descriptor. The position of the element in the buffer
-is calculated by linearizing memref indices in row-major order (lexically first
-index is the slowest varying, similar to C, but accounting for strides). The
-computation of the linear address is emitted as arithmetic operations in the LLVM
-dialect. Strides are extracted from the memref descriptor.
-
-Examples:
-
-An access to a memref with indices:
-
-```mlir
-%0 = load %m[%1,%2,%3,%4] : memref<?x?x4x8xf32, offset: ?>
-```
-
-is transformed into the equivalent of the following code:
-
-```mlir
-// Compute the linearized index from strides.
-// When strides or, in absence of explicit strides, the corresponding sizes are
-// dynamic, extract the stride value from the descriptor.
-%stride1 = llvm.extractvalue %m[4, 0] : !llvm.struct<(ptr<f32>, ptr<f32>, i64,
-                                                      array<4xi64>, array<4xi64>)>
-%addr1 = muli %stride1, %1 : i64
-
-// When the stride or, in absence of explicit strides, the trailing sizes are
-// known statically, this value is used as a constant. The natural value of
-// strides is the product of all sizes following the current dimension.
-%stride2 = llvm.mlir.constant(32 : index) : i64
-%addr2 = muli %stride2, %2 : i64
-%addr3 = addi %addr1, %addr2 : i64
-
-%stride3 = llvm.mlir.constant(8 : index) : i64
-%addr4 = muli %stride3, %3 : i64
-%addr5 = addi %addr3, %addr4 : i64
-
-// Multiplication with the known unit stride can be omitted.
-%addr6 = addi %addr5, %4 : i64
-
-// If the linear offset is known to be zero, it can also be omitted. If it is
-// dynamic, it is extracted from the descriptor.
-%offset = llvm.extractvalue %m[2] : !llvm.struct<(ptr<f32>, ptr<f32>, i64,
-                                                  array<4xi64>, array<4xi64>)>
-%addr7 = addi %addr6, %offset : i64
-
-// All accesses are based on the aligned pointer.
-%aligned = llvm.extractvalue %m[1] : !llvm.struct<(ptr<f32>, ptr<f32>, i64,
-                                                   array<4xi64>, array<4xi64>)>
-
-// Get the address of the data pointer.
-%ptr = llvm.getelementptr %aligned[%addr7]
-     : (!llvm.ptr<f32>, i64)
-     -> !llvm.ptr<f32>
-
-// Perform the actual load.
-%0 = llvm.load %ptr : !llvm.ptr<f32>
-```
-
-For stores, the address computation code is identical and only the actual store
-operation is different.
-
-Note: the conversion does not perform any sort of common subexpression
-elimination when emitting memref accesses.
diff --git a/mlir/docs/TargetLLVMIR.md b/mlir/docs/TargetLLVMIR.md
new file mode 100644
--- /dev/null
+++ b/mlir/docs/TargetLLVMIR.md
@@ -0,0 +1,898 @@
+# LLVM IR Target
+
+This document describes the mechanisms of producing LLVM IR from MLIR. The
+overall flow is two-stage:
+
+1.  **conversion** of the IR to a set of dialects translatable to LLVM IR, for
+    example [LLVM Dialect](Dialects/LLVM.md) or one of the hardware-specific
+    dialects derived from LLVM IR intrinsics such as [AMX](Dialects/AMX.md),
+    [X86Vector](Dialects/X86Vector.md) or [ArmNeon](Dialects/ArmNeon.md);
+2.  **translation** of MLIR dialects to LLVM IR.
+
+This flow allows non-trivial transformations to be performed within MLIR using
+MLIR APIs and makes the translation between MLIR and LLVM IR *simple* and
+potentially bidirectional. As a corollary, dialect ops translatable to LLVM IR
+are expected to closely match the corresponding LLVM IR instructions and
+intrinsics. This minimizes the dependency on LLVM IR libraries in MLIR and
+reduces the churn in case of changes.
+
+SPIR-V to LLVM dialect conversion has a
+[dedicated document](SPIRVToLLVMDialectConversion.md).
+
+[TOC]
+
+## Conversion to the LLVM Dialect
+
+Conversion to the LLVM dialect from other dialects is the first step to produce
+LLVM IR. All non-trivial IR modifications are expected to happen at this stage
+or before. The conversion is *progressive*: most passes convert one dialect to
+the LLVM dialect and keep operations from other dialects intact. For example,
+the `-convert-memref-to-llvm` pass will only convert operations from the
+`memref` dialect but will not convert operations from other dialects even if
+they use or produce `memref`-typed values.
+
+The process relies on the [Dialect Conversion](DialectConversion.md)
+infrastructure and, in particular, on the
+[materialization](DialectConversion.md#type-conversion) hooks of `TypeConverter`
+to support progressive lowering by injecting `unrealized_conversion_cast`
+operations between converted and unconverted operations. After multiple partial
+conversions to the LLVM dialect are performed, the cast operations that have
+become no-ops can be removed by the `-reconcile-unrealized-casts` pass. The
+latter pass is not specific to the LLVM dialect and can remove any no-op casts.
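+
+As a rough illustration (a sketch, not output copied from an actual pass run),
+assume a hypothetical unconverted operation `"some.producer"` that yields an
+`index` value consumed by an already converted `llvm.add`, and assume `index`
+converts to `i64`. The materialization hook bridges the two with a cast:
+
+```mlir
+// Hypothetical operation left untouched by the partial conversion.
+%0 = "some.producer"() : () -> index
+// Cast injected between the unconverted producer and the converted user.
+%1 = builtin.unrealized_conversion_cast %0 : index to i64
+%2 = llvm.add %1, %1 : i64
+```
+
+Once the producer is converted as well, the remaining casts typically form
+pairs that cancel each other (`i64` to `index` and back), which is the
+situation `-reconcile-unrealized-casts` is meant to clean up.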
+
+### Conversion of Built-in Types
+
+Built-in types have a default conversion to LLVM dialect types provided by the
+`LLVMTypeConverter` class. Users targeting the LLVM dialect can reuse and extend
+this type converter to support other types. Extra care must be taken if the
+conversion rules for built-in types are overridden: all conversions must use
+the same type converter.
+
+#### LLVM Dialect-compatible Types
+
+The types [compatible](Dialects/LLVM.md#built-in-type-compatibility) with the
+LLVM dialect are kept as is.
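+
+For instance, assuming the types below satisfy the compatibility rules linked
+above, they are expected to pass through the conversion unchanged:
+
+```mlir
+  i32
+  // ->
+  i32
+
+  vector<4xf32>
+  // ->
+  vector<4xf32>
+```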
+
+#### Complex Type
+
+Complex type is converted into an LLVM dialect literal structure type with two
+elements:
+
+-   real part;
+-   imaginary part.
+
+The elemental type is converted recursively using these rules.
+
+Example:
+
+```mlir
+  complex<f32>
+  // ->
+  !llvm.struct<(f32, f32)>
+```
+
+#### Index Type
+
+Index type is converted into an LLVM dialect integer type with the bitwidth
+specified by the [data layout](DataLayout.md) of the closest module. For
+example, on x86-64 CPUs it converts to `i64`. This behavior can be overridden
+by the type converter configuration, which is often exposed as a pass option by
+conversion passes.
+
+Example:
+
+```mlir
+  index
+  // -> on x86_64
+  i64
+```
+
+#### Ranked MemRef Types
+
+Ranked memref types are converted into an LLVM dialect literal structure type
+that contains the dynamic information associated with the memref object,
+referred to as *descriptor*. Only memrefs in the
+**[strided form](Dialects/Builtin.md/#strided-memref)** can be converted to the
+LLVM dialect with the default descriptor format. Memrefs with other, less
+trivial layouts should be converted into the strided form first, e.g., by
+materializing the non-trivial address remapping due to layout as `affine.apply`
+operations.
+
+The default memref descriptor is a struct with the following fields:
+
+1.  The pointer to the data buffer as allocated, referred to as "allocated
+    pointer". This is only useful for deallocating the memref.
+2.  The pointer to the properly aligned data pointer that the memref indexes,
+    referred to as "aligned pointer".
+3.  A converted `index`-type integer containing the distance in number
+    of elements between the beginning of the (aligned) buffer and the first
+    element to be accessed through the memref, referred to as "offset".
+4.  An array containing as many converted `index`-type integers as the rank of
+    the memref: the array represents the size, in number of elements, of the
+    memref along the given dimension.
+5.  A second array containing as many converted `index`-type integers as the
+    rank of the memref: the second array represents the "stride" (in tensor
+    abstraction sense), i.e. the number of consecutive elements of the
+    underlying buffer one needs to jump over to get to the next logically
+    indexed element.
+
+For constant memref dimensions, the corresponding size entry is a constant whose
+runtime value matches the static value. This normalization serves as an ABI for
+the memref type to interoperate with externally linked functions. In the
+particular case of rank `0` memrefs, the size and stride arrays are omitted,
+resulting in a struct containing two pointers + offset.
+
+Examples:
+
+```mlir
+// Assuming index is converted to i64.
+
+memref<f32> -> !llvm.struct<(ptr<f32>, ptr<f32>, i64)>
+memref<1 x f32> -> !llvm.struct<(ptr<f32>, ptr<f32>, i64,
+                                 array<1 x i64>, array<1 x i64>)>
+memref<? x f32> -> !llvm.struct<(ptr<f32>, ptr<f32>, i64,
+                                 array<1 x i64>, array<1 x i64>)>
+memref<10x42x42x43x123 x f32> -> !llvm.struct<(ptr<f32>, ptr<f32>, i64,
+                                               array<5 x i64>, array<5 x i64>)>
+memref<10x?x42x?x123 x f32> -> !llvm.struct<(ptr<f32>, ptr<f32>, i64,
+                                             array<5 x i64>, array<5 x i64>)>
+
+// Memref types can have vectors as element types
+memref<1x? x vector<4xf32>> -> !llvm.struct<(ptr<vector<4 x f32>>,
+                                             ptr<vector<4 x f32>>, i64,
+                                             array<2 x i64>, array<2 x i64>)>
+```
+
+#### Unranked MemRef Types
+
+Unranked memref types are converted to an LLVM dialect literal structure type
+that contains the dynamic information associated with the memref object,
+referred to as *unranked descriptor*. It contains:
+
+1.  a converted `index`-typed integer representing the dynamic rank of the
+    memref;
+2.  a type-erased pointer (`!llvm.ptr<i8>`) to a ranked memref descriptor with
+    the contents listed above.
+
+This descriptor is primarily intended for interfacing with rank-polymorphic
+library functions. The pointer to the ranked memref descriptor points to some
+*allocated* memory, which may reside on the stack of the current function or on
+the heap. Conversion patterns for operations producing unranked memrefs are
+expected to manage the allocation. Note that this may lead to stack allocations
+(`llvm.alloca`) being performed in a loop and not reclaimed until the end of the
+current function.
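+
+A minimal illustration, assuming `index` converts to `i64`. The element type
+does not appear in the converted type because the pointer to the ranked
+descriptor is type-erased:
+
+```mlir
+// Assuming index is converted to i64.
+
+memref<*xf32> -> !llvm.struct<(i64, ptr<i8>)>
+```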
+
+#### Function Types
+
+Function types are converted to LLVM dialect function types as follows:
+
+-   function argument and result types are converted recursively using these
+    rules;
+-   if a function type has multiple results, they are wrapped into an LLVM
+    dialect literal structure type since LLVM function types must have exactly
+    one result;
+-   if a function type has no results, the corresponding LLVM dialect function
+    type will have one `!llvm.void` result since LLVM function types must have a
+    result;
+-   function types used in arguments of another function type are wrapped in an
+    LLVM dialect pointer type to comply with LLVM IR expectations;
+-   the structs corresponding to `memref` types, both ranked and unranked,
+    appearing as function arguments are unbundled into individual function
+    arguments to allow for specifying metadata such as aliasing information on
+    individual pointers;
+-   the conversion of `memref`-typed arguments is subject to
+    [calling conventions](TargetLLVMIR.md#calling-conventions).
+
+Examples:
+
+```mlir
+// Zero-ary function type with no results:
+() -> ()
+// is converted to a zero-ary function with `void` result.
+!llvm.func<void ()>
+
+// Unary function with one result:
+(i32) -> (i64)
+// has its argument and result type converted, before creating the LLVM dialect
+// function type.
+!llvm.func<i64 (i32)>
+
+// Binary function with one result:
+(i32, f32) -> (i64)
+// has its arguments handled separately
+!llvm.func<i64 (i32, f32)>
+
+// Binary function with two results:
+(i32, f32) -> (i64, f64)
+// has its result aggregated into a structure type.
+!llvm.func<struct<(i64, f64)> (i32, f32)>
+
+// Function-typed arguments or results in higher-order functions:
+(() -> ()) -> (() -> ())
+// are converted into pointers to functions.
+!llvm.func<ptr<func<void ()>> (ptr<func<void ()>>)>
+
+// These rules apply recursively: a function type taking a function that takes
+// another function
+( ( (i32) -> (i64) ) -> () ) -> ()
+// is converted into a function type taking a pointer-to-function that takes
+// another pointer-to-function.
+!llvm.func<void (ptr<func<void (ptr<func<i64 (i32)>>)>>)>
+
+// A memref descriptor appearing as function argument:
+(memref<f32>) -> ()
+// gets converted into a list of individual scalar components of a descriptor.
+!llvm.func<void (ptr<f32>, ptr<f32>, i64)>
+
+// The list of arguments is linearized and one can freely mix memref and other
+// types in this list:
+(memref<f32>, f32) -> ()
+// which gets converted into a flat list.
+!llvm.func<void (ptr<f32>, ptr<f32>, i64, f32)>
+
+// For nD ranked memref descriptors:
+(memref<?x?xf32>) -> ()
+// the converted signature will contain 2n+1 `index`-typed integer arguments,
+// offset, n sizes and n strides, per memref argument type.
+!llvm.func<void (ptr<f32>, ptr<f32>, i64, i64, i64, i64, i64)>
+
+// Same rules apply to unranked descriptors:
+(memref<*xf32>) -> ()
+// which get converted into their components.
+!llvm.func<void (i64, ptr<i8>)>
+
+// However, returning a memref from a function is not affected:
+() -> (memref<?xf32>)
+// gets converted to a function returning a descriptor structure.
+!llvm.func<struct<(ptr<f32>, ptr<f32>, i64, array<1xi64>, array<1xi64>)> ()>
+
+// If multiple memref-typed results are returned:
+() -> (memref<f32>, memref<f64>)
+// their descriptor structures are additionally packed into another structure,
+// potentially with other non-memref typed results.
+!llvm.func<struct<(struct<(ptr<f32>, ptr<f32>, i64)>,
+                   struct<(ptr<f64>, ptr<f64>, i64)>)> ()>
+```
+
+Conversion patterns are available to convert built-in function operations and
+standard call operations targeting those functions using these conversion rules.
+
+#### Multi-dimensional Vector Types
+
+LLVM IR only supports *one-dimensional* vectors, unlike MLIR where vectors can
+be multi-dimensional. Vector types cannot be nested in either IR. In the
+one-dimensional case, MLIR vectors are converted to LLVM IR vectors of the same
+size with element type converted using these conversion rules. In the
+n-dimensional case, MLIR vectors are converted to (n-1)-dimensional array types
+of one-dimensional vectors.
+
+Examples:
+
+```
+vector<4x8 x f32>
+// ->
+!llvm.array<4 x vector<8 x f32>>
+
+memref<2 x vector<4x8 x f32>>
+// ->
+!llvm.struct<(ptr<array<4 x vector<8xf32>>>, ptr<array<4 x vector<8xf32>>>,
+              i64, array<1 x i64>, array<1 x i64>)>
+```
+
+#### Tensor Types
+
+Tensor types cannot be converted to the LLVM dialect. Operations on tensors must
+be [bufferized](Bufferization.md) before being converted.
+
+### Calling Conventions
+
+Calling conventions provide a mechanism to customize the conversion of function
+and function call operations without changing how individual types are handled
+elsewhere. They are implemented simultaneously by the default type converter and
+by the conversion patterns for the relevant operations.
+
+#### Function Result Packing
+
+In case of multi-result functions, the returned values are inserted into a
+structure-typed value before being returned and extracted from it at the call
+site. This transformation is a part of the conversion and is transparent to the
+definitions and uses of the values being returned.
+
+Example:
+
+```mlir
+func @foo(%arg0: i32, %arg1: i64) -> (i32, i64) {
+  return %arg0, %arg1 : i32, i64
+}
+func @bar() {
+  %0 = constant 42 : i32
+  %1 = constant 17 : i64
+  %2:2 = call @foo(%0, %1) : (i32, i64) -> (i32, i64)
+  "use_i32"(%2#0) : (i32) -> ()
+  "use_i64"(%2#1) : (i64) -> ()
+}
+
+// is transformed into
+
+llvm.func @foo(%arg0: i32, %arg1: i64) -> !llvm.struct<(i32, i64)> {
+  // insert the values into a structure
+  %0 = llvm.mlir.undef : !llvm.struct<(i32, i64)>
+  %1 = llvm.insertvalue %arg0, %0[0] : !llvm.struct<(i32, i64)>
+  %2 = llvm.insertvalue %arg1, %1[1] : !llvm.struct<(i32, i64)>
+
+  // return the structure value
+  llvm.return %2 : !llvm.struct<(i32, i64)>
+}
+llvm.func @bar() {
+  %0 = llvm.mlir.constant(42 : i32) : i32
+  %1 = llvm.mlir.constant(17 : i64) : i64
+
+  // call and extract the values from the structure
+  %2 = llvm.call @foo(%0, %1)
+     : (i32, i64) -> !llvm.struct<(i32, i64)>
+  %3 = llvm.extractvalue %2[0] : !llvm.struct<(i32, i64)>
+  %4 = llvm.extractvalue %2[1] : !llvm.struct<(i32, i64)>
+
+  // use as before
+  "use_i32"(%3) : (i32) -> ()
+  "use_i64"(%4) : (i64) -> ()
+}
+```
+
+#### Default Calling Convention for Ranked MemRef
+
+The default calling convention converts `memref`-typed function arguments to
+LLVM dialect literal structs
+[defined above](TargetLLVMIR.md#ranked-memref-types) before unbundling them into
+individual scalar arguments.
+
+This convention is implemented in the conversion of `std.func` and `std.call` to
+the LLVM dialect, with the former unpacking the descriptor into a set of
+individual values and the latter packing those values back into a descriptor so
+as to make it transparently usable by other operations. Conversions from other
+dialects should take this convention into account.
+
+This specific convention is motivated by the necessity to specify alignment and
+aliasing attributes on the raw pointers underpinning the memref.
+
+Examples:
+
+```mlir
+func @foo(%arg0: memref<?xf32>) -> () {
+  "use"(%arg0) : (memref<?xf32>) -> ()
+  return
+}
+
+// Gets converted to the following
+// (using type alias for brevity):
+!llvm.memref_1d = type !llvm.struct<(ptr<f32>, ptr<f32>, i64,
+                                     array<1xi64>, array<1xi64>)>
+
+llvm.func @foo(%arg0: !llvm.ptr<f32>,  // Allocated pointer.
+               %arg1: !llvm.ptr<f32>,  // Aligned pointer.
+               %arg2: i64,             // Offset.
+               %arg3: i64,             // Size in dim 0.
+               %arg4: i64) {           // Stride in dim 0.
+  // Populate memref descriptor structure.
+  %0 = llvm.mlir.undef : !llvm.memref_1d
+  %1 = llvm.insertvalue %arg0, %0[0] : !llvm.memref_1d
+  %2 = llvm.insertvalue %arg1, %1[1] : !llvm.memref_1d
+  %3 = llvm.insertvalue %arg2, %2[2] : !llvm.memref_1d
+  %4 = llvm.insertvalue %arg3, %3[3, 0] : !llvm.memref_1d
+  %5 = llvm.insertvalue %arg4, %4[4, 0] : !llvm.memref_1d
+
+  // Descriptor is now usable as a single value.
+  "use"(%5) : (!llvm.memref_1d) -> ()
+  llvm.return
+}
+```
+
+```mlir
+func @bar() {
+  %0 = "get"() : () -> (memref<?xf32>)
+  call @foo(%0) : (memref<?xf32>) -> ()
+  return
+}
+
+// Gets converted to the following
+// (using type alias for brevity):
+!llvm.memref_1d = type !llvm.struct<(ptr<f32>, ptr<f32>, i64,
+                                     array<1xi64>, array<1xi64>)>
+
+llvm.func @bar() {
+  %0 = "get"() : () -> !llvm.memref_1d
+
+  // Unpack the memref descriptor.
+  %1 = llvm.extractvalue %0[0] : !llvm.memref_1d
+  %2 = llvm.extractvalue %0[1] : !llvm.memref_1d
+  %3 = llvm.extractvalue %0[2] : !llvm.memref_1d
+  %4 = llvm.extractvalue %0[3, 0] : !llvm.memref_1d
+  %5 = llvm.extractvalue %0[4, 0] : !llvm.memref_1d
+
+  // Pass individual values to the callee.
+  llvm.call @foo(%1, %2, %3, %4, %5)
+    : (!llvm.ptr<f32>, !llvm.ptr<f32>, i64, i64, i64) -> ()
+  llvm.return
+}
+```
+
+#### Default Calling Convention for Unranked MemRef
+
+For unranked memrefs, the list of function arguments always contains two
+elements, same as the unranked memref descriptor: an integer rank, and a
+type-erased (`!llvm.ptr<i8>`) pointer to the ranked memref descriptor. Note that
+while the *calling convention* does not require allocation, *casting* to
+unranked memref does since one cannot take an address of an SSA value containing
+the ranked memref, which must be stored in some memory instead. The caller is in
+charge of ensuring the thread safety and management of the allocated memory, in
+particular the deallocation.
+
+Examples:
+
+```mlir
+func @foo(%arg0: memref<*xf32>) -> () {
+  "use"(%arg0) : (memref<*xf32>) -> ()
+  return
+}
+
+// Gets converted to the following.
+
+llvm.func @foo(%arg0: i64,             // Rank.
+               %arg1: !llvm.ptr<i8>) { // Type-erased pointer to descriptor.
+  // Pack the unranked memref descriptor.
+  %0 = llvm.mlir.undef : !llvm.struct<(i64, ptr<i8>)>
+  %1 = llvm.insertvalue %arg0, %0[0] : !llvm.struct<(i64, ptr<i8>)>
+  %2 = llvm.insertvalue %arg1, %1[1] : !llvm.struct<(i64, ptr<i8>)>
+
+  "use"(%2) : (!llvm.struct<(i64, ptr<i8>)>) -> ()
+  llvm.return
+}
+```
+
+```mlir
+func @bar() {
+  %0 = "get"() : () -> (memref<*xf32>)
+  call @foo(%0) : (memref<*xf32>) -> ()
+  return
+}
+
+// Gets converted to the following.
+
+llvm.func @bar() {
+  %0 = "get"() : () -> (!llvm.struct<(i64, ptr<i8>)>)
+
+  // Unpack the memref descriptor.
+  %1 = llvm.extractvalue %0[0] : !llvm.struct<(i64, ptr<i8>)>
+  %2 = llvm.extractvalue %0[1] : !llvm.struct<(i64, ptr<i8>)>
+
+  // Pass individual values to the callee.
+  llvm.call @foo(%1, %2) : (i64, !llvm.ptr<i8>) -> ()
+  llvm.return
+}
+```
+
+**Lifetime.** The second element of the unranked memref descriptor points to
+some memory in which the ranked memref descriptor is stored. By convention, this
+memory is allocated on stack and has the lifetime of the function. (*Note:* due
+to function-length lifetime, creation of multiple unranked memref descriptors,
+e.g., in a loop, may lead to stack overflows.) If an unranked descriptor has to
+be returned from a function, the ranked descriptor it points to is copied into
+dynamically allocated memory, and the pointer in the unranked descriptor is
+updated accordingly. The allocation happens immediately before returning. It is
+the responsibility of the caller to free the dynamically allocated memory. The
+default conversion of `std.call` and `std.call_indirect` copies the ranked
+descriptor to newly allocated memory on the caller's stack. Thus, the convention
+of the ranked memref descriptor pointed to by an unranked memref descriptor
+being stored on stack is respected.
+
+#### Bare Pointer Calling Convention for Ranked MemRef
+
+The "bare pointer" calling convention converts `memref`-typed function arguments
+to a *single* pointer to the aligned data. Note that this does *not* apply to
+uses of `memref` outside of function signatures, where the default descriptor
+structures are still used. This convention further restricts the supported
+cases to the following.
+
+-   `memref` types with default layout.
+-   `memref` types with all dimensions statically known.
+-   `memref` values allocated in such a way that the allocated and aligned
+    pointer match. Alternatively, the same function must handle allocation and
+    deallocation since only one pointer is passed to any callee.
+
+Examples:
+
+```
+func @callee(memref<2x4xf32>)
+
+func @caller(%0 : memref<2x4xf32>) {
+  call @callee(%0) : (memref<2x4xf32>) -> ()
+  return
+}
+
+// ->
+
+!descriptor = !llvm.struct<(ptr<f32>, ptr<f32>, i64,
+                            array<2xi64>, array<2xi64>)>
+
+llvm.func @callee(!llvm.ptr<f32>)
+
+llvm.func @caller(%arg0: !llvm.ptr<f32>) {
+  // A descriptor value is defined at the function entry point.
+  %0 = llvm.mlir.undef : !descriptor
+
+  // Both the allocated and aligned pointer are set up to the same value.
+  %1 = llvm.insertvalue %arg0, %0[0] : !descriptor
+  %2 = llvm.insertvalue %arg0, %1[1] : !descriptor
+
+  // The offset is set up to zero.
+  %3 = llvm.mlir.constant(0 : index) : i64
+  %4 = llvm.insertvalue %3, %2[2] : !descriptor
+
+  // The sizes and strides are derived from the statically known values.
+  %5 = llvm.mlir.constant(2 : index) : i64
+  %6 = llvm.mlir.constant(4 : index) : i64
+  %7 = llvm.insertvalue %5, %4[3, 0] : !descriptor
+  %8 = llvm.insertvalue %6, %7[3, 1] : !descriptor
+  // Row-major strides of a 2x4 memref are 4 and 1.
+  %9 = llvm.mlir.constant(1 : index) : i64
+  %10 = llvm.insertvalue %6, %8[4, 0] : !descriptor
+  %11 = llvm.insertvalue %9, %10[4, 1] : !descriptor
+
+  // The function call corresponds to extracting the aligned data pointer.
+  %12 = llvm.extractvalue %11[1] : !descriptor
+  llvm.call @callee(%12) : (!llvm.ptr<f32>) -> ()
+  llvm.return
+}
+```
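+
+At the level of function types, and assuming the restrictions above hold, the
+same example corresponds to the following conversion, written in the style of
+the function type examples earlier in this document:
+
+```mlir
+// Statically shaped memref argument with the default layout:
+(memref<2x4xf32>) -> ()
+// is converted into a function taking a single (aligned) pointer.
+!llvm.func<void (ptr<f32>)>
+```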
+
+#### Bare Pointer Calling Convention For Unranked MemRef
+
+The "bare pointer" calling convention does not support unranked memrefs as their
+shape cannot be known at compile time.
+
+### C-compatible wrapper emission
+
+In practical cases, it may be desirable to have externally-facing functions with
+a single argument corresponding to each MemRef argument. When interfacing with
+LLVM IR produced from C, the code needs to respect the corresponding calling
+convention. The conversion to the LLVM dialect provides an option to generate
+wrapper functions that take memref descriptors as pointers-to-struct compatible
+with data types produced by Clang when compiling C sources. The generation of
+such wrapper functions can additionally be controlled at a function granularity
+by setting the `llvm.emit_c_interface` unit attribute.
+
+More specifically, a memref argument is converted into a pointer-to-struct
+argument of type `{T*, T*, i64, i64[N], i64[N]}*` in the wrapper function, where
+`T` is the converted element type and `N` is the memref rank. This type is
+compatible with that produced by Clang for the following C++ structure template
+instantiations or their equivalents in C.
+
+```cpp
+template<typename T, size_t N>
+struct MemRefDescriptor {
+  T *allocated;
+  T *aligned;
+  intptr_t offset;
+  intptr_t sizes[N];
+  intptr_t strides[N];
+};
+```
+
+Furthermore, we also rewrite function results to pointer parameters if the
+rewritten function result has a struct type. The special result parameter is
+added as the first parameter and is of pointer-to-struct type.
+
+If enabled, the option will do the following. For *external* functions declared
+in the MLIR module:
+
+1.  Declare a new function `_mlir_ciface_<original name>` where memref arguments
+    are converted to pointer-to-struct and the remaining arguments are converted
+    as usual. Results are converted to a special argument if they are of struct
+    type.
+2.  Add a body to the original function (making it non-external) that
+    1.  allocates memref descriptors,
+    2.  populates them,
+    3.  potentially allocates space for the result struct, and
+    4.  passes the pointers to these into the newly declared interface function,
+        then
+    5.  collects the result of the call (potentially from the result struct),
+        and
+    6.  returns it to the caller.
+
+For (non-external) functions defined in the MLIR module:
+
+1.  Define a new function `_mlir_ciface_<original name>` where memref arguments
+    are converted to pointer-to-struct and the remaining arguments are converted
+    as usual. Results are converted to a special argument if they are of struct
+    type.
+2.  Populate the body of the newly defined function with IR that
+    1.  loads descriptors from pointers;
+    2.  unpacks descriptor into individual non-aggregate values;
+    3.  passes these values into the original function;
+    4.  collects the results of the call and
+    5.  either copies the results into the result struct or returns them to the
+        caller.
+
+Examples:
+
+```mlir
+
+func @qux(%arg0: memref<?x?xf32>)
+
+// Gets converted into the following
+// (using type alias for brevity):
+!llvm.memref_2d = type !llvm.struct<(ptr<f32>, ptr<f32>, i64,
+                                     array<2xi64>, array<2xi64>)>
+
+// Function with unpacked arguments.
+llvm.func @qux(%arg0: !llvm.ptr<f32>, %arg1: !llvm.ptr<f32>,
+               %arg2: i64, %arg3: i64, %arg4: i64,
+               %arg5: i64, %arg6: i64) {
+  // Populate memref descriptor (as per calling convention).
+  %0 = llvm.mlir.undef : !llvm.memref_2d
+  %1 = llvm.insertvalue %arg0, %0[0] : !llvm.memref_2d
+  %2 = llvm.insertvalue %arg1, %1[1] : !llvm.memref_2d
+  %3 = llvm.insertvalue %arg2, %2[2] : !llvm.memref_2d
+  %4 = llvm.insertvalue %arg3, %3[3, 0] : !llvm.memref_2d
+  %5 = llvm.insertvalue %arg5, %4[4, 0] : !llvm.memref_2d
+  %6 = llvm.insertvalue %arg4, %5[3, 1] : !llvm.memref_2d
+  %7 = llvm.insertvalue %arg6, %6[4, 1] : !llvm.memref_2d
+
+  // Store the descriptor in a stack-allocated space.
+  %8 = llvm.mlir.constant(1 : index) : i64
+  %9 = llvm.alloca %8 x !llvm.memref_2d
+     : (i64) -> !llvm.ptr<struct<(ptr<f32>, ptr<f32>, i64,
+                                        array<2xi64>, array<2xi64>)>>
+  llvm.store %7, %9 : !llvm.ptr<struct<(ptr<f32>, ptr<f32>, i64,
+                                        array<2xi64>, array<2xi64>)>>
+
+  // Call the interface function.
+  llvm.call @_mlir_ciface_qux(%9)
+     : (!llvm.ptr<struct<(ptr<f32>, ptr<f32>, i64,
+                          array<2xi64>, array<2xi64>)>>) -> ()
+
+  // The stored descriptor will be freed on return.
+  llvm.return
+}
+
+// Interface function.
+llvm.func @_mlir_ciface_qux(!llvm.ptr<struct<(ptr<f32>, ptr<f32>, i64,
+                                              array<2xi64>, array<2xi64>)>>)
+```
+
+```mlir
+func @foo(%arg0: memref<?x?xf32>) {
+  return
+}
+
+// Gets converted into the following
+// (using type alias for brevity):
+!llvm.memref_2d = type !llvm.struct<(ptr<f32>, ptr<f32>, i64,
+                                     array<2xi64>, array<2xi64>)>
+!llvm.memref_2d_ptr = type !llvm.ptr<struct<(ptr<f32>, ptr<f32>, i64,
+                                             array<2xi64>, array<2xi64>)>>
+
+// Function with unpacked arguments.
+llvm.func @foo(%arg0: !llvm.ptr<f32>, %arg1: !llvm.ptr<f32>,
+               %arg2: i64, %arg3: i64, %arg4: i64,
+               %arg5: i64, %arg6: i64) {
+  llvm.return
+}
+
+// Interface function callable from C.
+llvm.func @_mlir_ciface_foo(%arg0: !llvm.memref_2d_ptr) {
+  // Load the descriptor.
+  %0 = llvm.load %arg0 : !llvm.memref_2d_ptr
+
+  // Unpack the descriptor as per calling convention.
+  %1 = llvm.extractvalue %0[0] : !llvm.memref_2d
+  %2 = llvm.extractvalue %0[1] : !llvm.memref_2d
+  %3 = llvm.extractvalue %0[2] : !llvm.memref_2d
+  %4 = llvm.extractvalue %0[3, 0] : !llvm.memref_2d
+  %5 = llvm.extractvalue %0[3, 1] : !llvm.memref_2d
+  %6 = llvm.extractvalue %0[4, 0] : !llvm.memref_2d
+  %7 = llvm.extractvalue %0[4, 1] : !llvm.memref_2d
+  llvm.call @foo(%1, %2, %3, %4, %5, %6, %7)
+    : (!llvm.ptr<f32>, !llvm.ptr<f32>, i64, i64, i64,
+       i64, i64) -> ()
+  llvm.return
+}
+```
+
+```mlir
+func @foo(%arg0: memref<?x?xf32>) -> memref<?x?xf32> {
+  return %arg0 : memref<?x?xf32>
+}
+
+// Gets converted into the following
+// (using type alias for brevity):
+!llvm.memref_2d = type !llvm.struct<(ptr<f32>, ptr<f32>, i64,
+                                     array<2xi64>, array<2xi64>)>
+!llvm.memref_2d_ptr = type !llvm.ptr<struct<(ptr<f32>, ptr<f32>, i64,
+                                             array<2xi64>, array<2xi64>)>>
+
+// Function with unpacked arguments.
+llvm.func @foo(%arg0: !llvm.ptr<f32>, %arg1: !llvm.ptr<f32>, %arg2: i64,
+               %arg3: i64, %arg4: i64, %arg5: i64, %arg6: i64)
+    -> !llvm.memref_2d {
+  %0 = llvm.mlir.undef : !llvm.memref_2d
+  %1 = llvm.insertvalue %arg0, %0[0] : !llvm.memref_2d
+  %2 = llvm.insertvalue %arg1, %1[1] : !llvm.memref_2d
+  %3 = llvm.insertvalue %arg2, %2[2] : !llvm.memref_2d
+  %4 = llvm.insertvalue %arg3, %3[3, 0] : !llvm.memref_2d
+  %5 = llvm.insertvalue %arg5, %4[4, 0] : !llvm.memref_2d
+  %6 = llvm.insertvalue %arg4, %5[3, 1] : !llvm.memref_2d
+  %7 = llvm.insertvalue %arg6, %6[4, 1] : !llvm.memref_2d
+  llvm.return %7 : !llvm.memref_2d
+}
+
+// Interface function callable from C.
+llvm.func @_mlir_ciface_foo(%arg0: !llvm.memref_2d_ptr, %arg1: !llvm.memref_2d_ptr) {
+  %0 = llvm.load %arg1 : !llvm.memref_2d_ptr
+  %1 = llvm.extractvalue %0[0] : !llvm.memref_2d
+  %2 = llvm.extractvalue %0[1] : !llvm.memref_2d
+  %3 = llvm.extractvalue %0[2] : !llvm.memref_2d
+  %4 = llvm.extractvalue %0[3, 0] : !llvm.memref_2d
+  %5 = llvm.extractvalue %0[3, 1] : !llvm.memref_2d
+  %6 = llvm.extractvalue %0[4, 0] : !llvm.memref_2d
+  %7 = llvm.extractvalue %0[4, 1] : !llvm.memref_2d
+  %8 = llvm.call @foo(%1, %2, %3, %4, %5, %6, %7)
+    : (!llvm.ptr<f32>, !llvm.ptr<f32>, i64, i64, i64, i64, i64) -> !llvm.memref_2d
+  llvm.store %8, %arg0 : !llvm.memref_2d_ptr
+  llvm.return
+}
+```
+
+Rationale: Introducing auxiliary functions for C-compatible interfaces is
+preferred to modifying the calling convention since it will minimize the effect
+of C compatibility on intra-module calls or calls between MLIR-generated
+functions. In particular, when calling external functions from an MLIR module in
+a (parallel) loop, the fact of storing a memref descriptor on stack can lead to
+stack exhaustion and/or concurrent access to the same address. The auxiliary
+interface function serves as an allocation scope in this case. Furthermore, when
+targeting accelerators with separate memory spaces such as GPUs, stack-allocated
+descriptors passed by pointer would have to be transferred to the device memory,
+which introduces significant overhead. In such situations, auxiliary interface
+functions are executed on host and only pass the values through device function
+invocation mechanism.
+
+### Address Computation
+
+Accesses to a memref element are transformed into an access to an element of the
+buffer pointed to by the descriptor. The position of the element in the buffer
+is calculated by linearizing memref indices in row-major order (lexically first
+index is the slowest varying, similar to C, but accounting for strides). The
+computation of the linear address is emitted as arithmetic operations in the
+LLVM dialect. Strides are extracted from the memref descriptor.
+
+Examples:
+
+An access to a memref with indices:
+
+```mlir
+%0 = load %m[%1,%2,%3,%4] : memref<?x?x4x8xf32, offset: ?>
+```
+
+is transformed into the equivalent of the following code:
+
+```mlir
+// Compute the linearized index from strides.
+// When strides or, in absence of explicit strides, the corresponding sizes are
+// dynamic, extract the stride value from the descriptor.
+%stride1 = llvm.extractvalue[4, 0] : !llvm.struct<(ptr<f32>, ptr<f32>, i64,
+                                                   array<4xi64>, array<4xi64>)>
+%addr1 = muli %stride1, %1 : i64
+
+// When the stride or, in absence of explicit strides, the trailing sizes are
+// known statically, this value is used as a constant. The natural value of
+// strides is the product of all sizes following the current dimension.
+%stride2 = llvm.mlir.constant(32 : index) : i64
+%addr2 = muli %stride2, %2 : i64
+%addr3 = addi %addr1, %addr2 : i64
+
+%stride3 = llvm.mlir.constant(8 : index) : i64
+%addr4 = muli %stride3, %3 : i64
+%addr5 = addi %addr3, %addr4 : i64
+
+// Multiplication with the known unit stride can be omitted.
+%addr6 = addi %addr5, %4 : i64
+
+// If the linear offset is known to be zero, it can also be omitted. If it is
+// dynamic, it is extracted from the descriptor.
+%offset = llvm.extractvalue[2] : !llvm.struct<(ptr<f32>, ptr<f32>, i64,
+                                               array<4xi64>, array<4xi64>)>
+%addr7 = addi %addr6, %offset : i64
+
+// All accesses are based on the aligned pointer.
+%aligned = llvm.extractvalue[1] : !llvm.struct<(ptr<f32>, ptr<f32>, i64,
+                                                array<4xi64>, array<4xi64>)>
+
+// Compute the address of the accessed element from the linearized index.
+%ptr = llvm.getelementptr %aligned[%addr7]
+     : !llvm.struct<(ptr<f32>, ptr<f32>, i64, array<4xi64>, array<4xi64>)>
+     -> !llvm.ptr<f32>
+
+// Perform the actual load.
+%0 = llvm.load %ptr : !llvm.ptr<f32>
+```
+
+For stores, the address computation code is identical and only the actual store
+operation is different.
+
+Note: the conversion does not perform any sort of common subexpression
+elimination when emitting memref accesses.
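+
+As a cross-check of the arithmetic in the example above, the following Python
+sketch mirrors the linearization outside of MLIR. It is only an illustration:
+the helper names `natural_strides` and `linear_index` are hypothetical and not
+part of any MLIR API, and the two dynamic sizes of the memref are assumed to be
+5 and 6 at runtime.
+
+```python
+def natural_strides(sizes):
+    """Row-major strides: each stride is the product of all following sizes."""
+    strides = [1] * len(sizes)
+    for i in range(len(sizes) - 2, -1, -1):
+        strides[i] = strides[i + 1] * sizes[i + 1]
+    return strides
+
+
+def linear_index(indices, strides, offset=0):
+    """Linearized element position: offset + sum(index * stride)."""
+    assert len(indices) == len(strides)
+    return offset + sum(i * s for i, s in zip(indices, strides))
+
+
+# Assumed runtime shape for memref<?x?x4x8xf32>: 5x6x4x8.
+sizes = [5, 6, 4, 8]
+strides = natural_strides(sizes)
+position = linear_index([1, 2, 3, 4], strides, offset=10)
+```
+
+Here the strides of the two innermost static dimensions are 8 and 1, the stride
+of the second dimension is 32, and the outermost stride depends on the dynamic
+size. With indices `(1, 2, 3, 4)` and an offset of 10, the position is
+`192 + 64 + 24 + 4 + 10 = 294`.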
+
+### Utility Classes
+
+Utility classes common to many conversions to the LLVM dialect can be found
+under `lib/Conversion/LLVMCommon`. They include the following.
+
+-   `LLVMConversionTarget` specifies all LLVM dialect operations as legal.
+-   `LLVMTypeConverter` implements the default type conversion as described
+    above.
+-   `ConvertOpToLLVMPattern` extends the conversion pattern class with LLVM
+    dialect-specific functionality.
+-   `VectorConvertOpToLLVMPattern` extends the previous class to automatically
+    unroll operations on higher-dimensional vectors into lists of operations on
+    one-dimensional vectors before converting them.
+-   `StructBuilder` provides a convenient API for building IR that creates or
+    accesses values of LLVM dialect structure types; it is the base class of
+    `MemRefDescriptor`, `UnrankedMemRefDescriptor` and `ComplexBuilder` for the
+    built-in types convertible to LLVM dialect structure types.
+
+## Translation to LLVM IR
+
+MLIR modules containing `llvm.func`, `llvm.mlir.global` and `llvm.metadata`
+operations can be translated to LLVM IR modules using the following scheme.
+
+-   Module-level globals are translated to LLVM IR global values.
+-   Module-level metadata are translated to LLVM IR metadata, which can be later
+    augmented with additional metadata defined on specific ops.
+-   All functions are declared in the module so that they can be referenced.
+-   Each function is then translated separately and has access to the complete
+    mappings between MLIR and LLVM IR globals, metadata, and functions.
+-   Within a function, blocks are traversed in topological order and translated
+    to LLVM IR basic blocks. In each basic block, PHI nodes are created for each
+    of the block arguments, but not connected to their source blocks.
+-   Within each block, operations are translated in their order. Each operation
+    has access to the same mappings as the function and additionally to the
+    mapping of values between MLIR and LLVM IR, including PHI nodes. Operations
+    with regions are responsible for translating the regions they contain.
+-   After operations in a function are translated, the PHI nodes of blocks in
+    this function are connected to their source values, which are now available.
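+
+The reason for creating PHI nodes first and connecting them last can be seen in
+a toy model. The Python sketch below is not the MLIR implementation; the `Phi`
+class, the `translate_function` helper and the dictionary-based CFG encoding are
+hypothetical, and entry block arguments (which correspond to function arguments
+rather than PHI nodes) are left out. It only shows that a loop back-edge passes
+a value that does not exist yet when the PHI node of the loop header is created.
+
+```python
+from dataclasses import dataclass, field
+
+
+@dataclass
+class Phi:
+    name: str
+    incoming: list = field(default_factory=list)  # (predecessor, value) pairs
+
+
+def translate_function(blocks):
+    """blocks maps a block name to (argument names, [(successor, values)])."""
+    phis = {}
+    # First pass: create one unconnected PHI node per block argument.
+    for name, (args, _) in blocks.items():
+        phis[name] = [Phi(f"{name}.{arg}") for arg in args]
+    # Second pass: all values exist now, so connect PHI nodes to their sources.
+    for pred, (_, branches) in blocks.items():
+        for succ, values in branches:
+            for phi, value in zip(phis[succ], values):
+                phi.incoming.append((pred, value))
+    return phis
+
+
+# A loop whose header receives %init from the entry block and %next from itself.
+cfg = {
+    "entry": ([], [("loop", ["%init"])]),
+    "loop": (["i"], [("loop", ["%next"]), ("exit", [])]),
+    "exit": ([], []),
+}
+phis = translate_function(cfg)
+```
+
+After the second pass, the single PHI node of the `loop` block has two incoming
+edges, one from `entry` and one from `loop` itself.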
+
+The translation mechanism provides extension hooks for translating custom
+operations to LLVM IR via a dialect interface `LLVMTranslationDialectInterface`:
+
+-   `convertOperation` translates an operation that belongs to the current
+    dialect to LLVM IR given an `IRBuilderBase` and various mappings;
+-   `amendOperation` performs additional actions on an operation if it contains
+    a dialect attribute that belongs to the current dialect, for example sets up
+    instruction-level metadata.
+
+Dialects containing operations or attributes that want to be translated to LLVM
+IR must provide an implementation of this interface and register it with the
+system. Note that registration may happen without creating the dialect, for
+example, in a separate library to avoid the need for the "main" dialect library
+to depend on LLVM IR libraries. The implementations of these methods may use
+the
+[`ModuleTranslation`](https://mlir.llvm.org/doxygen/classmlir_1_1LLVM_1_1ModuleTranslation.html)
+object provided to them which holds the state of the translation and contains
+numerous utilities.
+
+Note that this extension mechanism is *intentionally restrictive*. LLVM IR has a
+small, relatively stable set of instructions and types that MLIR intends to
+model fully. Therefore, the extension mechanism is provided only for LLVM IR
+constructs that are more often extended -- intrinsics and metadata. The primary
+goal of the extension mechanism is to support sets of intrinsics, for example
+those representing a particular instruction set. The extension mechanism does
+not allow for customizing type or block translation, nor does it support custom
+module-level operations. Such transformations should be performed within MLIR
+and target the corresponding MLIR constructs.
+
+## Translation from LLVM IR
+
+An experimental flow allows one to import a substantially limited subset of LLVM
+IR into MLIR, producing LLVM dialect operations.
+
+```
+  mlir-translate -import-llvm filename.ll
+```