This is an archive of the discontinued LLVM Phabricator instance.

[llvm-symbolizer] Add --approximate-missing-line-numbers Command Line Option
Needs ReviewPublic

Authored by gbreynoo on Feb 28 2022, 6:56 AM.

Details

Summary

There are cases in which line information is missing, for example when the compiler gives no line number entry due to optimizations making a line number nebulous. When using llvm-symbolizer to debug these cases the user has to test the nearby addresses to find a valid source line number. This change adds the command line option --approximate-missing-line-numbers, in cases in which no line number is found the line number of the previous address will be output instead.

Diff Detail

Event Timeline

gbreynoo created this revision.Feb 28 2022, 6:56 AM
gbreynoo requested review of this revision.Feb 28 2022, 6:56 AM
Herald added a project: Restricted Project. · View Herald Transcript

To be clear this new output is an approximation for cases in which the canonical line number is 0 and so not too helpful to the user. I wanted to suggest this functionality as we have seen cases in which this new option would be useful to a user. Suggestions for a better name than --approximate-missing-line-numbers would also be appreciated.

jhenderson added a subscriber: dblaikie.

I've already expressed my doubts offline about this change potentially causing people to end up with misleading output. However, in and of itself, I don't think that's a reason to reject it, since the user has to opt-in, although I'd like to hear from some other opinions (@dblaikie?). However, I do think it is important for a user to be able to distinguish between an approximated address and a "definitely right" address, by the output being annotated somehow with something like "(approximate)".

llvm/docs/CommandGuide/llvm-symbolizer.rst
209–211
llvm/include/llvm/DebugInfo/Symbolize/Symbolize.h
147

Nit: no blank lines at start of functions.

llvm/lib/DebugInfo/Symbolize/Symbolize.cpp
126–127

This is unrelated to the demangling block above, so split it up.

154

Wrong function name style.

160–164

If I'm following this correctly, this ends up in a recursive loop until you find a non-zero line number or address zero? It sounds honestly rather inefficient to me, in the event of a compiler emitting a large amount of code with line 0. That being said, I don't know how often such a situation could occur. I would at least like to see a test case for multiple consecutive line 0 entries.

llvm/test/tools/llvm-symbolizer/approximate-missing-line-numbers-inline.s
7–8

I might be being dumb: is this test case supposed to be showing that --approximate-missing-line-numbers does nothing if the address is non-zero? That's fine (we should have that case), but if so, I a) don't see a non-inlined case where the option does anything (the address 16 case looks like it is the same essential case as this address 6 case).

llvm/test/tools/llvm-symbolizer/approximate-missing-line-numbers.s
1

Is there any real difference between inlined and non-inline cases? The code change is under the symbolizeCodeCommon function after all.

8–12

I'm not sure output style is at all relevant to this option?

llvm/tools/llvm-symbolizer/llvm-symbolizer.cpp
371

Nit: clang-format.

Herald added a project: Restricted Project. · View Herald TranscriptMar 1 2022, 10:49 PM
MaskRay added a comment.EditedMar 2 2022, 10:54 AM

Suppose we are going to add such an interface:

@jhenderson: If I'm following this correctly, this ends up in a recursive loop until you find a non-zero line number or address zero? ...

If we want to avoid the loop iteratively calling SymbolizableObjectFile::symbolizeCode: getModuleSectionIndexForAddress, DebugInfoContext->getLineInfoForAddress, and getNameFromSymbolTable will need to be taught to skip line==0 lines.
Changing getNameFromSymbolTable may be straightforward (it is basically a llvm::upper_bound call), but need to expose a new interface. Changes to the other two functions will be tricky.

Otherwise, a loop is perhaps good enough. If we want to guard against a pessimistic case that calls symbolizeCode too many times (for example, try symbolizing 0x2000 but only 0x1000 has a line number; there will be 0x1000 failed trials which can be too slow), we may need to teach the above functions to stop, or set an arbitrary trial limit. Note that the trial limit can be encoded as --approximate-missing-line-numbers=<value>, so the user will be partially responsible for making llvm-symbolizer too slow in the pessimistic cases :)

Yeah, I'm not sure this is a great way to go - we do have info in the symbolizer for where the start of the function is, is that sufficient for the use cases you've come across?

And if we're going to give an approximate answer, it shouldn't be based on the probing solution implemented here - but probably lower-level/based on walking the line table to find the nearest non-zero location instead.

The decrementing solution came up because in the cases where we see this in the wild, it's symbolizing a backtrace (or possibly just one return address) and the return address naturally doesn't point to the call instruction, it points one past the call instruction; the instruction after a call might have line 0 for various reasons, the most blatant being that it's a noreturn call. But the call instruction itself nearly always will not have line 0.
So, _in practice_, we'll basically decrement once and be done.

Obviously there can be pathological cases and we should avoid those. But a "try decrementing once only" solution seems like it would solve the practical problem, and avoid the pathological cases.

Incidentally I agree with @jhenderson that there should be some indication in the output that a result is approximate.

The decrementing solution came up because in the cases where we see this in the wild, it's symbolizing a backtrace (or possibly just one return address) and the return address naturally doesn't point to the call instruction, it points one past the call instruction; the instruction after a call might have line 0 for various reasons, the most blatant being that it's a noreturn call. But the call instruction itself nearly always will not have line 0.

Wait, though, that seems like a very different problem from the one described - that seems like an issue of trying to symbolize the wrong address (well, I thought stack trace reports would subtract one from addresses, but it doesn't seem lke LLVM's crash handler does that - so it does give the address/symbolized location of the return address, not the point of the call - I've certainly had discussions with folks internally who get confused by that - but maybe changing it would produce more churn/confusion, even though it's I think what people are more likely to expect in a back trace ("where am I now" not "where will I be when each of these functions return"))

So, _in practice_, we'll basically decrement once and be done.

Obviously there can be pathological cases and we should avoid those. But a "try decrementing once only" solution seems like it would solve the practical problem, and avoid the pathological cases.

I'm a bit less sure of this change/direction more generally, given this framing/motivation - like maybe decrementing once before even symbolizing would be better/more reliable/consistent in this use case? (but yeah, maybe more confusing for folks used to/more aware of the nuance of stack traces being where the program is returning to, rather than where it is currently - maybe there are further complications with tail calls, etc, which mean decrementing doesn't actually get you to the call site that reached the callee)

Still, if we're going this way, I'm not super happy with the probing approach - I /think/ it should probably be implemented at some lower level where we can walk backwards through the line table. Though, yeah, maybe that turns out to be silly complicated (especially when dealing with both the line table and the inline info from the debug_info DIEs too)

I think the encoding of the line table may make it infeasible to walk backwards. But perhaps the code that does interpret the line table could keep track of the most-recent-nonzero-line and return that in addition to (or instead of) zero, when the requested address does resolve to zero? If it's instead-of, the API would have to return an indication that it was approximated, so this could be displayed appropriately.

Adding to the argument for displaying an indication that the nonzero line is not exact: When you're not looking at stack traces, but arbitrary addresses (e.g. the instruction that caused a trap, or something) it's also not implausible that the instruction is at the top of a basic block that starts off with line 0, and you don't actually know whether control flowed in from the previous block or there was a jump to it from somewhere else. That makes the visible indication pretty important.

Re stack tracers should subtract one: The failure to do this is exactly why PS4 (and we're not the only ones) set TrapUnreachable, to (for example) make sure there is an instruction after a noreturn call that happens to be laid out as the last block in a function, and that the "return address" of that noreturn function is not the first instruction of the physically next function. Yes we could make this change in our own code, but we don't own all the code that does this kind of thing, so we choose to guarantee that the return address is at least symbolized into the correct function.

I think the fix would be somewhere around DWARFDebugLine::LineTable::getFileLineInfoForAddress where, if the resulting row (specified by RowIndex) is line zero, then the previous row could be inspected - no need to probe addresses that are in the same row and would return the same result repeatedly.