Index: docs/SourceLevelDebugging.rst
===================================================================
--- docs/SourceLevelDebugging.rst
+++ docs/SourceLevelDebugging.rst
@@ -393,6 +393,103 @@
 
 .. _ccxx_frontend:
 
+Object lifetime in optimized code
+=================================
+
+In the example above, every variable assignment uniquely corresponds to a
+memory store to the variable's position on the stack. However, in heavily
+optimized code LLVM promotes most variables into SSA registers, which can
+eventually be placed in physical registers or memory locations. To track SSA
+registers through compilation, when objects are promoted to SSA registers an
+``llvm.dbg.value`` intrinsic is created for each assignment, recording the
+variable's new location. Compared with the ``llvm.dbg.declare`` intrinsic:
+
+* A dbg.value invalidates the location of any earlier dbg.values for the
+  specified variable.
+* The dbg.value's position in the IR defines where in the instruction stream
+  the variable location changes.
+
+Recording this debugging information exposes it to optimization transformations
+to a much greater extent. For example:
+
+* Entire basic blocks may be deleted or merged.
+* Redundant value computations may be eliminated.
+* Value computations may be hoisted or sunk.
+
+As a result, optimization passes must take care when dbg.value intrinsics are
+altered or moved -- the developer will observe such changes reflected in
+variable valuations when debugging the program. For any execution of the
+optimized program, the set of variable valuations presented to the developer
+by the debugger should not show a state that would never have existed in the
+execution of the unoptimized program, given the same input. Doing so risks
+misleading the developer by reporting a state that does not exist, damaging
+their understanding of the optimized program.
+
+Sometimes perfectly preserving variable locations is not possible, often
+when a redundant calculation is optimized out. In such cases, a dbg.value with
+operand ``undef`` should be used to terminate earlier variable locations and
+present ``optimized out`` to the developer. Withholding these potentially stale
+variable valuations from the developer diminishes the amount of available debug
+information, but increases the reliability of the remaining information.
+
+To illustrate some potential issues, consider the following example:
+
+.. code-block:: llvm
+
+  define i32 @foo(i32 %bar, i1 %cond) {
+    call void @llvm.dbg.value(metadata i32 0, metadata !1, metadata !2)
+    br i1 %cond, label %truebr, label %falsebr
+  truebr:
+    %tval = add i32 %bar, 1
+    call void @llvm.dbg.value(metadata i32 %tval, metadata !1, metadata !2)
+    br label %exit
+  falsebr:
+    %fval = add i32 %bar, 2
+    call void @llvm.dbg.value(metadata i32 %fval, metadata !1, metadata !2)
+    br label %exit
+  exit:
+    %merge = phi i32 [ %tval, %truebr ], [ %fval, %falsebr ]
+    call void @llvm.dbg.value(metadata i32 %merge, metadata !1, metadata !2)
+    %toret = add i32 %merge, 10
+    call void @llvm.dbg.value(metadata i32 %toret, metadata !1, metadata !2)
+    ret i32 %toret
+  }
+
+Which, perhaps, could be optimized into the following code:
+
+.. code-block:: llvm
+
+  define i32 @foo(i32 %bar, i1 %cond) {
+    call void @llvm.dbg.value(metadata i32 0, metadata !1, metadata !2)
+    %addoper = select i1 %cond, i32 11, i32 12
+    call void @llvm.dbg.value(metadata i32 undef, metadata !1, metadata !2)
+    %toret = add i32 %bar, %addoper
+    call void @llvm.dbg.value(metadata i32 %toret, metadata !1, metadata !2)
+    ret i32 %toret
+  }
+
+What dbg.value intrinsics should be placed to represent the original variable
+locations in this code? We could certainly place the first dbg.value, of
+constant value zero, at the beginning of the function.
+However, if this is the only dbg.value intrinsic in the function, our debugging
+information will indicate that the variable ``!1`` has a constant value for the
+whole function. To avoid this, we must insert a dbg.value wherever the next
+assignment to ``!1`` would occur -- in this case, in blocks that have been
+optimized out. Assuming we cannot recover the value of the assignments in those
+blocks, we conservatively insert an ``undef`` dbg.value where they would have
+branched to. Finally, we keep the dbg.value for the return value, giving us:
+
+.. code-block:: llvm
+
+  define i32 @foo(i32 %bar, i1 %cond) {
+    call void @llvm.dbg.value(metadata i32 0, metadata !1, metadata !2)
+    %addoper = select i1 %cond, i32 11, i32 12
+    call void @llvm.dbg.value(metadata i32 undef, metadata !1, metadata !2)
+    %toret = add i32 %bar, %addoper
+    call void @llvm.dbg.value(metadata i32 %toret, metadata !1, metadata !2)
+    ret i32 %toret
+  }
+
 C/C++ front-end specific debug information
 ==========================================