Index: llvm/utils/TableGen/DecoderEmitter.md
===================================================================
--- /dev/null
+++ llvm/utils/TableGen/DecoderEmitter.md
@@ -0,0 +1,225 @@
+# Decoder emitter
+
+The decoder emitter generates the code for the disassembler of several architectures.
+
+The general design is split into three classes.
+
+- `DecoderEmitter`: Controls code generation and pre-processes `Records`.
+- `FilterChooser`: Builds the decoding state machine.
+- `Filter`: Represents a state in the decoding state machine.
+
+The rough process of the code generation is described in the following diagram.
+The details about the state machine creation (the `recurse` step) are explained below.
+
+```
+DecoderEmitter FilterChooser Filter Output Stream
+
+ ┌───┐ ┌───┐ ┌───┐ ┌───┐
+ │ ├────────┐ │ │ │ │ │ │
+ │ │ │Separate instr. │ │ │ │ │ │
+ │ │ │into groups │ │ │ │ │ │
+ │ │ │ │ │ │ │ │ │
+ │ │◄───────┘ │ │ │ │ │ │
+ │ │ │ │ │ │ │ │
+ │ │ Start separation process │ │ Separate │ │ │ │
+ │ │ of instr. groups │ │ into │ │ │ │
+ │ ├──────────────────────────►│ │ subsets │ │ │ │
+ │ │ │ ├───────────┐ │ │ │ │
+ │ │ │ │ │ Add Filter to │ │ │ │
+ │ │ │ │ │ Filterlist │ │ │ │
+ │ │ │ │ R ├──────────────────►│ │ │ │
+ │ │ │ │ E │ │ │Emit state │ │
+ │ │ │ │ C │ │ ├──────────────────► │ │
+ │ │ │ │ U │ └───┘ │ │
+ │ │ │ │ R │ │ │
+ │ │ │ │ S │ │ │
+ │ │ │ │ E │ │ │
+ │ │ │ │ │ │ │
+ │ │ │ │ │ │ │
+ │ │ │ │ │ │ │
+ │ │ │ │◄──────────┘ │ │
+ │ │ │ │ │ │
+ │ │ │ │ │ │
+ │ │◄──────────────────────────┤ │ │ │
+ │ │ Return list of decoder │ │ │ │
+ │ │ syntax └───┘ │ │
+ │ │ │ │
+ │ │ │ │
+ │ │ │ │
+ │ │ │ │
+ │ │ Emit decoder logic │ │
+ │ ├──────────────────────────────────────────────────────────────────────────────────────► │ │
+ │ │ │ │
+ └───┘ └───┘
+```
+
+# Instruction Decoding
+
+The disassembler of LLVM decodes instructions with the help of a non-cyclic [state machine](https://en.wikipedia.org/wiki/Finite-state_machine).
+
+Because we describe here how this state machine is generated.
+
+## The general idea
+
+We start with a single set off all instructions of an architecture.
+These instructions are put in groups. Each group holds the instructions of a specific CPU extension or hardware mode
+(think of `Thumb` mode in ARM, or vector extension of some processors).
+For each group a state machine is generated. It decodes instructions of this group.
+
+To generate the state machine we first take all instructions of a group.
+This set is split into subsets each of which can be distinguished to the other subsets by a certain bit field in its encoding.
+This and other separation information is called `Filter`.
+
+The subsets are further split into smaller subsets until only single instructions are left.
+
+_Each subset represents a state in our decoding state machine._
+
+
+
+This generator will build the states for several instruction groups (the `uint8_t decoder[]` tables)
+and a decoder function which walks over those states until it fails or identifies an instruction.
+
+## Step 1 - Determine best bits for set separation
+
+In this step we determine which bits are most suited to separate the current set of instructions into subsets.
+
+Lets assume we have four instructions with a width of 8 bits (all instructions of a group have the same bit width).
+
+```text
+Bit 0 1 2 3 4 5 6 7
+
+IA: 0 1 0 ? ? 1 ? ?
+IB: ? 1 0 0 0 ? ? ?
+IC: 0 0 0 0 1 ? ? ?
+ID: ? 0 1 ? ? ? ? ?
+
+`?` = unset bit position (holds variable bits)
+```
+
+Now we have a `BitAttr` mask which saves which bits are suitable for filtering the set into subsets.
+
+```
+BitAttr: . . _ _ _ _ _ _
+
+"." = Bit already used by previous Filter.
+"_" = Bit unset/None (not used by previous Filter)
+```
+
+In the beginning all bits are unset (`_`) but the more instructions become filtered the more bits are set to `.`.
+
+To determine which bits in instruction `IA` to `ID` can be used to distinguish them, we define a automaton (see below).
+The automaton gets the value at `BitAttr[i]` (`Filtered` or `Unset`/`None`) as start state and as input the bits at `IA[i]`, `IB[i]`, `IC[i]`, `ID[i]`.
+
+The result can be: `AllSet` (`S`), `All_unset` (`U`), `Mixed` (`M`) or `Filtered` (`F`)
+The result is written to `BitAttr[i]`.
+
+```
+ 0,1
+ ┌─────┐
+ │ │
+ │ │
+ │ │
+ │ ▼
+ ┌─┴───────┐
+ 0,1 │ │ 0,1,?
+ ┌─────────────────►│All_Set │ ┌────┐
+ │ │ │ │ │
+ │ └────┬────┘ │ │
+ │ │ │ │
+ │ │? │ ▼
+ │ │ ┌───┴──────┐
+ │ ▼ │ │
+┌────┴───┐ ┌───────────┐ │ Filtered │
+│ │ │ ├─────┐ │ │
+│ None │ │ Mixed │ │0,1,? └──────────┘
+│ │ │ │◄────┘
+└───┬────┘ └───────────┘
+ │ ▲
+ │ │0,1
+ │ │
+ │ ┌────┴─────┐
+ │ ? │ │
+ └──────────────────►│ All_unset│
+ │ │
+ └─┬────────┘
+ │ ▲
+ │ │
+ │ │
+ │ │
+ └─────┘
+ ?
+```
+
+Here is how our little example would play out.
+
+```
+IA: 0 1 0 ? ? 1 ? ?
+IB: ? 1 0 0 0 ? ? ?
+IC: 0 0 0 0 1 ? ? ?
+ID: ? 0 1 ? ? ? ? ?
+
+BitAttr: . . _ _ _ _ _ _
+
+becomes
+
+BitAttr: . . S M M M U U
+```
+
+## Step 2 - Determine potential Filters
+
+Now we report regions in the `BitAttr` which might be suitable as a `Filter`.
+A `region` is the index of the start bit and the length.
+
+There are three region reporting strategies:
+
+1. Report only successive `S` bits.
+2. Report only successive `S` or successive `M` bits.
+3. A special and rare case (explained in code).
+
+The strategies are tried from 1 to 3.
+If non of them works we get a conflict (the set of instructions are indistinguishable). This case is explained in the code.
+
+If we continue with our example we get the following regions:
+
+```
+region = (startIndex, length)
+
+Bit 0 1 2 3 4 5 6 7
+BitAttr: . . S M M M U U
+
+strategy 1 = R1 = (2, 1)
+strategy 2 = R1 = (2, 1), R2 = (3, 3)
+```
+
+## Step 3 - Select the best region as Filter
+
+We determine the best `Filter` by checking which region distinguishes the most instructions.
+Lets assume we used strategy 2 here, so we have:
+
+```
+Bit 0 1 2 3 4 5 6 7
+
+IA: 0 1 0 ? ? 1 ? ?
+IB: ? 1 0 0 0 ? ? ?
+IC: 0 0 0 0 1 ? ? ?
+ID: ? 0 1 ? ? ? ? ?
+
+R1 = (2, 1)
+R2 = (3, 3)
+```
+
+`R1` can create two subsets: `(IA, IB, IC)` and `(ID)`.
+`R2` can create only one subset `(IA, IB, IC, ID)`.
+
+`R1` can separate instructions by bit 2 (which is either `0` or `1` in all instructions).
+
+`R2` has not a single bit position where all instruction bits are known (remember: `?` are variable bits, we can't distinguish them).
+Therefore it can not separate instructions.
+
+Because `R1` creates the most subsets it is chosen as the Filter.
+
+## Recurse
+
+The `BitAttr` bits are reset to either `Filtered` (if used) or `Unset` and passed on to the inferior `FilterChooser` of each subset.
+For each subset we start the process again from `Step 1` (not for subsets of size 1 obviously).
+