diff --git a/clang/docs/DataFlowSanitizerDesign.rst b/clang/docs/DataFlowSanitizerDesign.rst
--- a/clang/docs/DataFlowSanitizerDesign.rst
+++ b/clang/docs/DataFlowSanitizerDesign.rst
@@ -135,6 +135,30 @@
 track of what labels they have used so far, picking one that is yet unused,
 etc).
 
+Origin tracking trace representation
+------------------------------------
+
+Every four 4-byte-aligned application bytes share a 4-byte origin value. A
+4-byte origin contains a 4-bit depth and a 28-bit hash ID of a chain.
+
+A chain ID is calculated as a hash of a chain structure. A chain structure
+contains a stack ID and the previous chain ID. The chain head has a zero
+previous chain ID. A stack ID is a hash of a stack trace. The 4-bit depth
+limits the maximum length of a path. The environment variable
+``origin_history_size`` can set the depth limit. Non-positive values mean
+unlimited. Its default value is 16. Once the limit is reached, origin tracking
+ignores subsequent propagation chains.
+
+A chain starts at a ``dfsan_set_label`` call with a non-zero label. A new chain
+is added at stores or memory transfers when ``-dfsan-track-origins`` is 1. Memory
+transfers include LLVM memory transfer instructions and wrapped glibc memcpy and
+memmove. When ``-dfsan-track-origins`` is 2, a new chain is also added at loads.
+
+Other instructions do not create new chains, but simply propagate origin values.
+If an instruction has more than one operand with a non-zero label, the origin
+value of the last operand with a non-zero label is propagated to the result of
+this instruction.
+
 Memory layout and label management
 ----------------------------------
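
Reviewer note: the origin layout and propagation rule described in the added section can be sketched in a few lines of C++. This is a reading aid under stated assumptions, not the runtime's implementation. The patch gives only the field widths (4-bit depth, 28-bit chain ID); placing the depth in the top four bits is an assumption here, and every name below (`MakeOrigin`, `OriginDepth`, `OriginChainId`, `OriginSlot`, `PropagateOrigin`, `Label`) is hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical label type; the patch does not state the label width.
using Label = uint8_t;

// Field widths come from the patch text. Which bits hold which field is an
// assumption of this sketch: depth in the top 4 bits, chain ID in the low 28.
constexpr uint32_t kDepthBits = 4;
constexpr uint32_t kChainIdBits = 32 - kDepthBits;
constexpr uint32_t kDepthMask = (1u << kDepthBits) - 1;
constexpr uint32_t kChainIdMask = (1u << kChainIdBits) - 1;

// Packs a depth and a chain hash ID into one 4-byte origin value.
constexpr uint32_t MakeOrigin(uint32_t depth, uint32_t chain_id) {
  return ((depth & kDepthMask) << kChainIdBits) | (chain_id & kChainIdMask);
}

constexpr uint32_t OriginDepth(uint32_t origin) {
  return origin >> kChainIdBits;
}

constexpr uint32_t OriginChainId(uint32_t origin) {
  return origin & kChainIdMask;
}

// Index of the origin slot covering an application address. The four bytes of
// one 4-byte-aligned group map to the same slot, so they share one origin.
constexpr uintptr_t OriginSlot(uintptr_t app_addr) { return app_addr >> 2; }

// Propagation rule for instructions that do not create a chain: the result
// takes the origin of the last operand whose label is non-zero, or 0 if no
// operand is labeled.
inline uint32_t PropagateOrigin(const Label *labels, const uint32_t *origins,
                                size_t n) {
  assert(n == 0 || (labels != nullptr && origins != nullptr));
  uint32_t result = 0;
  for (size_t i = 0; i < n; ++i) {
    if (labels[i] != 0)
      result = origins[i];  // a later labeled operand overrides earlier ones
  }
  return result;
}
```

For example, with operand labels `{1, 0, 2}` and operand origins `{10, 20, 30}`, `PropagateOrigin` returns 30, the origin of the last labeled operand.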