At the last LLVM dev meeting we had a BoF session on debug info for optimized code. In that session I presented some graphs showing how the quality of the debug info produced by LLVM has changed over the last couple of years. This is a cleaned-up version of the patch I used to collect that data. It is implemented as an extension of llvm-dwarfdump, adding a new --statistics option. The intended use case is to run this automatically on the debug info produced by, e.g., our bots, so that we can identify and act on eyebrow-raising changes or regressions introduced by new transformations.
In the current form, two kinds of data are being collected:
- the number of variables that have a debug location versus the number of variables in total (inlined instances of the same function are counted separately, so a variable that is completely missing from only one instance will still be found)
- the PC range covered by variable location descriptions versus the PC range of all variables' containing lexical scopes
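To make the two ratios concrete, here is a minimal sketch in Python of how they could be computed from already-extracted data. It is not the code in this patch and it does not reproduce the tool's output format: the `Variable` record, the `covered_bytes` and `summarize` helpers, and the dictionary keys are all hypothetical names chosen for illustration. It assumes each variable instance (including each inlined copy) comes with the half-open PC range of its containing lexical scope and a list of half-open PC ranges that have a location description.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Tuple

PCRange = Tuple[int, int]  # half-open [low_pc, high_pc)


@dataclass
class Variable:
    """Hypothetical record for one variable instance (one per inlined copy)."""
    name: str
    scope: PCRange                       # PC range of the containing lexical scope
    locations: List[PCRange] = field(default_factory=list)


def covered_bytes(var: Variable) -> int:
    """Bytes of the scope covered by at least one location range.

    Ranges are clamped to the scope and overlapping ranges are counted once.
    """
    lo, hi = var.scope
    total = 0
    end = lo  # everything below `end` has already been counted
    for start, stop in sorted(var.locations):
        start = max(start, end)
        stop = min(stop, hi)
        if stop > start:
            total += stop - start
            end = stop
    return total


def summarize(variables: List[Variable]) -> Dict[str, int]:
    """Collect the raw numerators and denominators of both ratios."""
    return {
        "variables": len(variables),
        "variables_with_location": sum(1 for v in variables if v.locations),
        "scope_bytes": sum(v.scope[1] - v.scope[0] for v in variables),
        "covered_bytes": sum(covered_bytes(v) for v in variables),
    }


# Illustrative data only: "i" appears twice because the function containing
# it was inlined once, and the inlined copy lost its location entirely.
example = [
    Variable("i", (0x10, 0x30), [(0x10, 0x20), (0x18, 0x28)]),
    Variable("i", (0x40, 0x80), []),
    Variable("n", (0x80, 0x90), [(0x80, 0x90)]),
]
stats = summarize(example)
```

With this made-up input, 2 of 3 variable instances have a location, and 40 of the 112 bytes of scope are covered. Reporting the raw counts rather than precomputed percentages is a choice made for the sketch: it lets results from several object files be summed before taking the ratio.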
The output format is versioned and extensible, so I'm looking forward to both bug fixes and ideas for other data that would be interesting to track.