Changeset View
Standalone View
llvm/docs/HowToUpdateDebugInfo.rst
Show All 12 Lines | |||||
This document specifies how to correctly update debug info in various kinds of | This document specifies how to correctly update debug info in various kinds of | ||||
code transformations, and offers suggestions for how to create targeted debug | code transformations, and offers suggestions for how to create targeted debug | ||||
info tests for arbitrary transformations. | info tests for arbitrary transformations. | ||||
For more on the philosophy behind LLVM debugging information, see | For more on the philosophy behind LLVM debugging information, see | ||||
:doc:`SourceLevelDebugging`. | :doc:`SourceLevelDebugging`. | ||||
IR-level transformations | Rules for updating debug locations | ||||
======================== | ================================== | ||||
Deleting an Instruction | .. _WhenToPreserveLocation: | ||||
----------------------- | |||||
When to preserve an instruction location | |||||
---------------------------------------- | |||||
A transformation should preserve the debug location of an instruction if the | |||||
dblaikie: Is this kind of "must" language consistent with the rest of this document? In this case it's an… | |||||
Not Done ReplyInline ActionsThis is aspirational -- would "should" be more appropriate? vsk: This is aspirational -- would "should" be more appropriate? | |||||
Not Done ReplyInline Actions"should" sounds right to me - but open to other folks preferences/ideas about how this should be framed dblaikie: "should" sounds right to me - but open to other folks preferences/ideas about how this should… | |||||
instruction either remains in its basic block, or if its basic block is folded | |||||
into a predecessor that branches unconditionally. The APIs to use are | |||||
``IRBuilder``, or ``Instruction::setDebugLoc``. | |||||
Not Done ReplyInline ActionsDo you see this rule as covering tail duplication? AFAIUI preserving the location when tail duplicating is correct, but involves multiple non-unique predecessors. (Or, I misunderstand unique here). jmorse: Do you see this rule as covering tail duplication? AFAIUI preserving the location when tail… | |||||
Thanks for catching this, I think it should cover tail duplication. I've updated the wording to: '... if its basic block is folded into a predecessor that branches unconditionally'. vsk: Thanks for catching this, I think it should cover tail duplication. I've updated the wording to… | |||||
The purpose of this rule is to ensure that common block-local optimizations | |||||
preserve the ability to set breakpoints on source locations corresponding to | |||||
s/source lines/source locations/ That's also what they are called in Clang. aprantl: s/source lines/source locations/
That's also what they are called in Clang. | |||||
the instructions they touch. Debugging, crash logs, and SamplePGO accuracy | |||||
Debugging, crash logs, and SamplePGO accuracy, ... aprantl: `Debugging, crash logs, and SamplePGO accuracy, ...` | |||||
would be severely impacted if that ability were lost. | |||||
Examples of transformations that should follow this rule include: | |||||
* Instruction scheduling. Block-local instruction reordering should not drop | |||||
s/line/source/ aprantl: s/line/source/ | |||||
source locations, even though this may lead to jumpy single-stepping | |||||
behavior. | |||||
* Simple jump threading. For example, if block ``B1`` unconditionally jumps to | |||||
``B2``, *and* is its unique predecessor, instructions from ``B2`` can be | |||||
Source aprantl: Source | |||||
hoisted into ``B1``. Source locations from ``B2`` should be preserved. | |||||
Simple peephole optimizations that replace or expand an instruction, like aprantl: Simple peephole optimizations that replace or expand an instruction, like | |||||
* Peephole optimizations that replace or expand an instruction, like ``(add X | |||||
I would either reuse + and << or use add and shl in the previous line aprantl: I would either reuse + and << or use add and shl in the previous line | |||||
X) => (shl X 1)``. The location of the ``shl`` instruction should be the same | |||||
as the location of the ``add`` instruction. | |||||
* Tail duplication. For example, if blocks ``B1`` and ``B2`` both | |||||
unconditionally branch to ``B3`` and ``B3`` can be folded into its | |||||
s/breakpoint/source/ aprantl: s/breakpoint/source/ | |||||
predecessors, source locations from ``B3`` should be preserved. | |||||
Examples of transformations for which this rule *does not* apply include: | |||||
* LICM. E.g., if an instruction is moved from the loop body to the preheader, | |||||
the rule for :ref:`dropping locations<WhenToDropLocation>` applies. | |||||
.. _WhenToMergeLocation: | |||||
When to merge instruction locations | |||||
Not Done ReplyInline ActionsShould this be "or" rather than "and"? If you merge two instructions even in the same BB, I think the goal is you should use a merged location, right? dblaikie: Should this be "or" rather than "and"? If you merge two instructions even in the same BB, I… | |||||
I'm of two minds on this one. The applicable (hypothetical) example here is "(A * B) + C => llvm.fma.f32(A, B, C)". If the part about "being in a different block" is kept, then the location of the "fma" instruction should be determined by location cascade. That location has a realistic shot at mapping back to some real (if somewhat arbitrary) source line. If get rid of the "being in a different block" language, the location of the "fma" instruction should be artificial (line 0). It seems like both options suit the interactive debugging and crash triage use cases. But maybe location cascade would be a better fit for SamplePGO, if that needs to find a "real" line number to increment the right basic block execution count? It'd be helpful for anyone familiar with SamplePGO to express some opinion (cc @danielcdh). vsk: I'm of two minds on this one. The applicable (hypothetical) example here is "(A * B) + C =>… | |||||
Not Done ReplyInline ActionsNot sure I follow the fma example. Let me rephrase the words in the document to see if we're both understanding them the same way: "IF multiple instructions are merged and the merged location is in a different block than one of the inputs - you /should/ (must, whatever) merge" - the implication is then if you have multiple instructions that don't change blocks, the location should /not/ be merged? That doesn't sound like what I'd expect - in the fma example, if the * and - were from the same basic block, I'd still expect to merge the location rather than dropping or zeroing the location. So that scope information was preserved (eg: if * and - came from the same inline function, then the fma should still be attributed to that inline function). dblaikie: Not sure I follow the fma example.
Let me rephrase the words in the document to see if we're… | |||||
Not Done ReplyInline ActionsRe: "the implication is then if you have multiple instructions that don't change blocks, the location should /not/ be merged" -- yep, we're on the same page about the proposed rule. Your point about expecting the right inline scope on merged instructions is compelling though. I'll fix this up. Part of my confusion here was over how exactly SamplePGO works: can it correctly attribute a sample of a line 0 location to the scope it points to? Maybe that's moot, though, since relying on location cascade to assign locations to merged instructions could be more misleading than helpful. vsk: Re: "the implication is then if you have multiple instructions that don't change blocks, the… | |||||
Not Done ReplyInline Actions
I don't actually know how it works.
In theory I think such samples could be used, but fairly likely they aren't used. (you could imagine a sample based profiler could look at all source lines within a sampled basic block - and mark all those source lines as sampled) dblaikie: > Part of my confusion here was over how exactly SamplePGO works:
I don't actually know how… | |||||
Not Done ReplyInline ActionsI believe that (at least how we use it in Sony) SPGO doesn't use a line-0 location, but snoops around looking for a real source location in the same block. Jeremy will poke a couple people who actually worked on it to try to get a more definitive answer tomorrow (they're in the UK). probinson: I believe that (at least how we use it in Sony) SPGO doesn't use a line-0 location, but snoops… | |||||
Not Done ReplyInline ActionsWhen sampling an executable, the "hits" are addresses which must be mapped back to source locations using debug line info (i.e. addr2line). Any transform that moves or merges instructions without updating the debug location can therefore affect the quality of the sample. The standard example is tail-merging (which in fact was the first optimisation we fixed). Imagine we have an if-then-else, where the "then" and "else" blocks end with the same instruction(s). Tail-merge will common the instructions in the successor block. For block-reordering, we want to know if one side of the if-then-else is executed much more than the other. But if tail-merge has retained the original debug locations in the merged tail we will get an inaccurate sample. If the "then" location is used, hits on the common instructions will be attributed to the "then" block which is wrong. Likewise if the "else" locations were used. The location of the tail is ambiguous. To fix this we introduced the "getMergedLocation" call. The initial version was very simple. If the locations were the same, either can be used. But if the locations are different, we can't give a location and an empty ("unknown") location is used. At some point, this "unknown" location ends up generating a DWARF line-0 record. Now this is the bit I didn't directly work on so I don't know where it occurs. My memory is that the work was done by Paul, and that an "unknown" location at the start of a basic block forced a line-0. This did exactly what we wanted, and we rewrote our initial patch to tail-merge that explicitly created a line-0 location. rob.lougher: When sampling an executable, the "hits" are addresses which must be mapped back to source… | |||||
Not Done ReplyInline ActionsP.S. We did a presentation of this work at EuroLLVM 2017 (https://www.youtube.com/watch?v=ceCEXnuWdmo) which you can watch for more examples. rob.lougher: P.S. We did a presentation of this work at EuroLLVM 2017 (https://www.youtube.com/watch? | |||||
Not Done ReplyInline ActionsRob, I think the most relevant question is: If there's a sample hit on an address that maps to line 0, what does SPGO do with it? Throw it away? Rummage around for a non-zero location? probinson: Rob, I think the most relevant question is: If there's a sample hit on an address that maps to… | |||||
Not Done ReplyInline ActionsThis is handled by AutoFDO (which reads a "raw" sample and converts it into a sample with lines relative to the start of the function). It's 4 years since I hacked that code so I wanted to check to make sure. Anyway, the answer is that samples for addresses that map to line 0 are simply thrown away... rob.lougher: This is handled by AutoFDO (which reads a "raw" sample and converts it into a sample with lines… | |||||
@probinson @rob.lougher thanks for the context on SamplePGO. I'm going to propose switching the language here to recommend merging instruction locations, because the alternative of relying on location cascade doesn't seem like it's better for SamplePGO purposes. vsk: @probinson @rob.lougher thanks for the context on SamplePGO. I'm going to propose switching the… | |||||
----------------------------------- | |||||
A transformation should merge instruction locations if it replaces multiple | |||||
instructions with a single merged instruction, *and* that merged instruction | |||||
does not correspond to any of the original instructions' locations. The API to | |||||
use is ``Instruction::applyMergedLocation``. | |||||
The purpose of this rule is to ensure that a) the single merged instruction | |||||
has a location with an accurate scope attached, and b) to prevent misleading | |||||
single-stepping (or breakpoint) behavior. Often, merged instructions are memory | |||||
accesses which can trap: having an accurate scope attached greatly assists in | |||||
crash triage by identifying the (possibly inlined) function where the bad | |||||
memory access occurred. This rule is also meant to assist SamplePGO by banning | |||||
scenarios in which a sample of a block containing a merged instruction is | |||||
misattributed to a block containing one of the instructions-to-be-merged. | |||||
Examples of transformations that should follow this rule include: | |||||
* Merging identical loads/stores which occur on both sides of a CFG diamond | |||||
(see the ``MergedLoadStoreMotion`` pass). | |||||
* Merging identical loop-invariant stores (see the LICM utility | |||||
``llvm::promoteLoopAccessesToScalars``). | |||||
Not Done ReplyInline ActionsBit more clarification here? echristo: Bit more clarification here? | |||||
Not Done ReplyInline ActionsHow about: "Converting an if-then-else CFG diamond into a select. Preserving the debug locations of speculated instructions can make it seem like a condition is true when it's not (or vice versa), which can lead to a confusing single-stepping experience. The rule for :ref:dropping locations<WhenToDropLocation> should apply here." ? vsk: How about: "Converting an if-then-else CFG diamond into a select. Preserving the debug… | |||||
I wonder if we should drop the Simple/Complex adjectives altogether? aprantl: I wonder if we should drop the Simple/Complex adjectives altogether? | |||||
* Peephole optimizations which combine multiple instructions together, like | |||||
``(add (mul A B) C) => llvm.fma.f32(A, B, C)``. Note that the location of | |||||
the ``fma`` does not exactly correspond to the locations of either the | |||||
``mul`` or the ``add`` instructions. | |||||
Examples of transformations for which this rule *does not* apply include: | |||||
* Block-local peepholes which delete redundant instructions, like | |||||
``(sext (zext i8 %x to i16) to i32) => (zext i8 %x to i32)``. The inner | |||||
Not Done ReplyInline ActionsI would be curious about an explanation what makes this example different from the fma example above. aprantl: I would be curious about an explanation what makes this example different from the fma example… | |||||
Not Done ReplyInline ActionsThanks for pressing on this, it's not terribly clear as-written. How about: "Block-local peepholes which delete redundant instructions, like `(sext (zext i8 %x to i16) to i32) => (zext i8 %x to i32). The inner zext` is modified but remains in its block, so the rule for preserving locations should apply." vsk: Thanks for pressing on this, it's not terribly clear as-written. How about: "Block-local… | |||||
Not Done ReplyInline Actionssounds good! aprantl: sounds good! | |||||
``zext`` is modified but remains in its block, so the rule for | |||||
:ref:`preserving locations<WhenToPreserveLocation>` should apply. | |||||
* Converting an if-then-else CFG diamond into a ``select``. Preserving the | |||||
debug locations of speculated instructions can make it seem like a condition | |||||
is true when it's not (or vice versa), which leads to a confusing | |||||
single-stepping experience. The rule for | |||||
:ref:`dropping locations<WhenToDropLocation>` should apply here. | |||||
* Hoisting identical instructions which appear in several successor blocks into | |||||
a predecessor block (see ``BranchFolder::HoistCommonCodeInSuccs``). In this | |||||
case there is no single merged instruction. The rule for | |||||
:ref:`dropping locations<WhenToDropLocation>` applies. | |||||
.. _WhenToDropLocation: | |||||
When to drop an instruction location | |||||
------------------------------------ | |||||
A transformation should drop debug locations if the rules for | |||||
:ref:`preserving<WhenToPreserveLocation>` and | |||||
:ref:`merging<WhenToMergeLocation>` debug locations do not apply. The API to | |||||
use is ``Instruction::setDebugLoc()``. | |||||
The purpose of this rule is to prevent erratic or misleading single-stepping | |||||
behavior in situations in which an instruction has no clear, unambiguous | |||||
Not Done ReplyInline ActionsMight be worth a separate header - not sure how consistently we do this or how problematic it is if this isn't done. dblaikie: Might be worth a separate header - not sure how consistently we do this or how problematic it… | |||||
Not Done ReplyInline ActionsThat's a good point, e.g. I don't think the ASan pass bothers with setting line 0 locations. I'll try to split this up. vsk: That's a good point, e.g. I don't think the ASan pass bothers with setting line 0 locations. | |||||
relationship to a source location. | |||||
To handle an instruction without a location, the DWARF generator | |||||
defaults to allowing the last-set location after a label to cascade forward, or | |||||
to setting a line 0 location with viable scope information if no previous | |||||
location is available. | |||||
See the discussion in the section about | |||||
:ref:`merging locations<WhenToMergeLocation>` for examples of when the rule for | |||||
dropping locations applies. | |||||
Rules for updating debug values | |||||
=============================== | |||||
Deleting an IR-level Instruction | |||||
-------------------------------- | |||||
When an ``Instruction`` is deleted, its debug uses change to ``undef``. This is | When an ``Instruction`` is deleted, its debug uses change to ``undef``. This is | ||||
a loss of debug info: the value of a one or more source variables becomes | a loss of debug info: the value of a one or more source variables becomes | ||||
unavailable, starting with the ``llvm.dbg.value(undef, ...)``. When there is no | unavailable, starting with the ``llvm.dbg.value(undef, ...)``. When there is no | ||||
way to reconstitute the value of the lost instruction, this is the best | way to reconstitute the value of the lost instruction, this is the best | ||||
possible outcome. However, it's often possible to do better: | possible outcome. However, it's often possible to do better: | ||||
* If the dying instruction can be RAUW'd, do so. The | * If the dying instruction can be RAUW'd, do so. The | ||||
▲ Show 20 Lines • Show All 277 Lines • Show Last 20 Lines |
Is this kind of "must" language consistent with the rest of this document? In this case it's an aspirational, sort of "this is ideal/what we're striving for" rather than "if you don't do this it'll fail the verifier/break", etc?