This is an archive of the discontinued LLVM Phabricator instance.

Differential D120660

[llvm-symbolizer] Add --approximate-missing-line-numbers Command Line Option
Needs Review · Public

Authored by gbreynoo on Feb 28 2022, 6:56 AM.

Details

Reviewers

jhenderson
MaskRay
aorlov
noajshu
mysterymath
jdoerfert
dblaikie

Summary

There are cases in which line information is missing, for example when the compiler emits no line number entry because optimizations have made the source line ambiguous. When using llvm-symbolizer to debug these cases, the user has to try nearby addresses to find a valid source line number. This change adds the command line option --approximate-missing-line-numbers: when no line number is found for an address, the line number of the previous address is output instead.

Diff Detail

Repository: rG LLVM Github Monorepo

Event Timeline

gbreynoo created this revision. Feb 28 2022, 6:56 AM

Herald added subscribers: rupprecht, hiraditya. Feb 28 2022, 6:56 AM

gbreynoo requested review of this revision. Feb 28 2022, 6:56 AM

Herald added a reviewer: jdoerfert. Feb 28 2022, 6:56 AM

Herald added a project: Restricted Project.

Herald added subscribers: llvm-commits, sstefan1.

To be clear, this new output is an approximation for cases in which the canonical line number is 0, which is not very helpful to the user. I wanted to suggest this functionality because we have seen cases in which the new option would be useful to a user. Suggestions for a better name than --approximate-missing-line-numbers would also be appreciated.

Harbormaster completed remote builds in B151752: Diff 411801. Feb 28 2022, 7:55 AM

I've already expressed my doubts offline about this change potentially causing people to end up with misleading output. However, in and of itself, I don't think that's a reason to reject it, since the user has to opt in, although I'd like to hear some other opinions (@dblaikie?). I do think it is important for a user to be able to distinguish between an approximated address and a "definitely right" address, by annotating the output with something like "(approximate)".

llvm/docs/CommandGuide/llvm-symbolizer.rst
209–211
llvm/include/llvm/DebugInfo/Symbolize/Symbolize.h
147	Nit: no blank lines at start of functions.
llvm/lib/DebugInfo/Symbolize/Symbolize.cpp
126–127	This is unrelated to the demangling block above, so split it up.
154	Wrong function name style.
160–164	If I'm following this correctly, this ends up in a recursive loop until you find a non-zero line number or address zero? It sounds honestly rather inefficient to me, in the event of a compiler emitting a large amount of code with line 0. That being said, I don't know how often such a situation could occur. I would at least like to see a test case for multiple consecutive line 0 entries.
llvm/test/tools/llvm-symbolizer/approximate-missing-line-numbers-inline.s
7–8	I might be being dumb: is this test case supposed to be showing that --approximate-missing-line-numbers does nothing if the address is non-zero? That's fine (we should have that case), but if so, I a) don't see a non-inlined case where the option does anything (the address 16 case looks like it is the same essential case as this address 6 case).
llvm/test/tools/llvm-symbolizer/approximate-missing-line-numbers.s
1	Is there any real difference between inlined and non-inline cases? The code change is under the `symbolizeCodeCommon` function after all.
8–12	I'm not sure output style is at all relevant to this option?
llvm/tools/llvm-symbolizer/llvm-symbolizer.cpp
371	Nit: clang-format.

Herald added a project: Restricted Project. Mar 1 2022, 10:49 PM

Suppose we are going to add such an interface:

@jhenderson: If I'm following this correctly, this ends up in a recursive loop until you find a non-zero line number or address zero? ...

If we want to avoid the loop iteratively calling SymbolizableObjectFile::symbolizeCode: getModuleSectionIndexForAddress, DebugInfoContext->getLineInfoForAddress, and getNameFromSymbolTable will need to be taught to skip line==0 lines.
Changing getNameFromSymbolTable may be straightforward (it is basically a llvm::upper_bound call), but it would need to expose a new interface. Changes to the other two functions will be tricky.

Otherwise, a loop is perhaps good enough. If we want to guard against a pessimistic case that calls symbolizeCode too many times (for example, try symbolizing 0x2000 but only 0x1000 has a line number; there will be 0x1000 failed trials which can be too slow), we may need to teach the above functions to stop, or set an arbitrary trial limit. Note that the trial limit can be encoded as --approximate-missing-line-numbers=<value>, so the user will be partially responsible for making llvm-symbolizer too slow in the pessimistic cases :)
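
As a rough illustration of the trial-limit idea above, the sketch below probes backwards a bounded number of times. The names `probeBackwards`, `MaxTrials`, and the `Lookup` callback are hypothetical: the callback stands in for a repeated symbolizeCode call, and none of this is code from the patch or from LLVM.

```cpp
#include <cstdint>
#include <functional>

// Hypothetical helper, not part of the patch. Lookup returns the line number
// for an address, with 0 meaning "no line". Starting at Addr, step back one
// address at a time until a non-zero line is found, address 0 is reached, or
// MaxTrials extra lookups have been made. Returns 0 if nothing was found.
unsigned probeBackwards(uint64_t Addr, unsigned MaxTrials,
                        const std::function<unsigned(uint64_t)> &Lookup) {
  unsigned Line = Lookup(Addr);
  for (unsigned Trial = 0; Line == 0 && Addr != 0 && Trial < MaxTrials; ++Trial)
    Line = Lookup(--Addr);
  return Line;
}
```

With `MaxTrials` set to 1 this reduces to a single decrement, and the 0x2000 versus 0x1000 case described above stops after `MaxTrials` failed lookups rather than 0x1000 of them.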

Yeah, I'm not sure this is a great way to go - we do have info in the symbolizer for where the start of the function is, is that sufficient for the use cases you've come across?

And if we're going to give an approximate answer, it shouldn't be based on the probing solution implemented here - but probably lower-level/based on walking the line table to find the nearest non-zero location instead.

The decrementing solution came up because in the cases where we see this in the wild, it's symbolizing a backtrace (or possibly just one return address) and the return address naturally doesn't point to the call instruction, it points one past the call instruction; the instruction after a call might have line 0 for various reasons, the most blatant being that it's a noreturn call. But the call instruction itself nearly always will not have line 0.
So, _in practice_, we'll basically decrement once and be done.

Obviously there can be pathological cases and we should avoid those. But a "try decrementing once only" solution seems like it would solve the practical problem, and avoid the pathological cases.

Incidentally I agree with @jhenderson that there should be some indication in the output that a result is approximate.
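
One possible shape for the requested annotation, as a minimal sketch only: `LineResult` and `formatLocation` are hypothetical names and do not correspond to DILineInfo or to the llvm-symbolizer printers. The "(approximate)" suffix is the wording suggested earlier in the review.

```cpp
#include <string>

// Hypothetical result type for illustration; not LLVM's DILineInfo.
struct LineResult {
  std::string File;
  unsigned Line = 0;
  bool Approximate = false; // true when Line was borrowed from a nearby address
};

// Render "file:line" and mark borrowed line numbers so that a reader can tell
// them apart from exact matches.
std::string formatLocation(const LineResult &R) {
  std::string Out = R.File + ":" + std::to_string(R.Line);
  if (R.Approximate)
    Out += " (approximate)";
  return Out;
}
```

Carrying the flag in the result rather than in the printer would let every output style decide for itself how to show it.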

In D120660#3357237, @probinson wrote:

The decrementing solution came up because in the cases where we see this in the wild, it's symbolizing a backtrace (or possibly just one return address) and the return address naturally doesn't point to the call instruction, it points one past the call instruction; the instruction after a call might have line 0 for various reasons, the most blatant being that it's a noreturn call. But the call instruction itself nearly always will not have line 0.

Wait, though, that seems like a very different problem from the one described - that seems like an issue of trying to symbolize the wrong address (well, I thought stack trace reports would subtract one from addresses, but it doesn't seem like LLVM's crash handler does that - so it does give the address/symbolized location of the return address, not the point of the call - I've certainly had discussions with folks internally who get confused by that - but maybe changing it would produce more churn/confusion, even though I think it's what people are more likely to expect in a back trace ("where am I now" not "where will I be when each of these functions return")).

So, _in practice_, we'll basically decrement once and be done.

Obviously there can be pathological cases and we should avoid those. But a "try decrementing once only" solution seems like it would solve the practical problem, and avoid the pathological cases.

I'm a bit less sure of this change/direction more generally, given this framing/motivation - like maybe decrementing once before even symbolizing would be better/more reliable/consistent in this use case? (but yeah, maybe more confusing for folks used to/more aware of the nuance of stack traces being where the program is returning to, rather than where it is currently - maybe there are further complications with tail calls, etc, which mean decrementing doesn't actually get you to the call site that reached the callee)

Still, if we're going this way, I'm not super happy with the probing approach - I /think/ it should probably be implemented at some lower level where we can walk backwards through the line table. Though, yeah, maybe that turns out to be silly complicated (especially when dealing with both the line table and the inline info from the debug_info DIEs too)
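
The "decrement before symbolizing" alternative mentioned above can be sketched as follows. `toLookupAddresses` is a hypothetical helper, and the sketch assumes that frame 0 holds the current program counter while every deeper frame holds a return address; the tail-call complications raised above are not addressed.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical pre-processing step, not part of the patch. Frame 0 is looked
// up as-is; each deeper frame is a return address, so one is subtracted to
// land inside the call instruction rather than on the instruction after it.
std::vector<uint64_t> toLookupAddresses(const std::vector<uint64_t> &Frames) {
  std::vector<uint64_t> Out;
  Out.reserve(Frames.size());
  for (std::size_t I = 0; I < Frames.size(); ++I) {
    uint64_t Addr = Frames[I];
    if (I != 0 && Addr != 0)
      --Addr;
    Out.push_back(Addr);
  }
  return Out;
}
```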

I think the encoding of the line table may make it infeasible to walk backwards. But perhaps the code that does interpret the line table could keep track of the most-recent-nonzero-line and return that in addition to (or instead of) zero, when the requested address does resolve to zero? If it's instead-of, the API would have to return an indication that it was approximated, so this could be displayed appropriately.
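
A minimal sketch of the "most-recent-nonzero-line" idea, under simplifying assumptions: the line table is modelled as a flat vector of rows sorted by address, sequence boundaries and end_sequence rows are ignored, and `LineRow`, `LineLookup`, and `lookupLine` are hypothetical names rather than LLVM APIs.

```cpp
#include <cstdint>
#include <vector>

struct LineRow {
  uint64_t Address;
  unsigned Line;
};

struct LineLookup {
  unsigned Line;     // 0 if no usable line was found
  bool Approximated; // true if Line came from an earlier row
};

// Walk the rows forward, as a line-table interpreter would, remembering the
// last non-zero line seen. Rows must be sorted by Address; the last row whose
// Address is <= Addr is the one that covers Addr.
LineLookup lookupLine(const std::vector<LineRow> &Rows, uint64_t Addr) {
  unsigned Current = 0, LastNonZero = 0;
  bool Found = false;
  for (const LineRow &R : Rows) {
    if (R.Address > Addr)
      break;
    Found = true;
    Current = R.Line;
    if (R.Line != 0)
      LastNonZero = R.Line;
  }
  if (!Found || Current != 0)
    return {Current, false};
  return {LastNonZero, LastNonZero != 0};
}
```

With rows modelled on approximate-missing-line-numbers.s (line 3 and line 4 at address 0, line 0 at address 6), a lookup of address 6 yields line 4 flagged as approximated, while a lookup of address 0 yields line 4 unflagged.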

Adding to the argument for displaying an indication that the nonzero line is not exact: When you're not looking at stack traces, but arbitrary addresses (e.g. the instruction that caused a trap, or something) it's also not implausible that the instruction is at the top of a basic block that starts off with line 0, and you don't actually know whether control flowed in from the previous block or there was a jump to it from somewhere else. That makes the visible indication pretty important.

Re stack tracers should subtract one: The failure to do this is exactly why PS4 (and we're not the only ones) set TrapUnreachable, to (for example) make sure there is an instruction after a noreturn call that happens to be laid out as the last block in a function, and that the "return address" of that noreturn function is not the first instruction of the physically next function. Yes we could make this change in our own code, but we don't own all the code that does this kind of thing, so we choose to guarantee that the return address is at least symbolized into the correct function.

I think the fix would be somewhere around DWARFDebugLine::LineTable::getFileLineInfoForAddress where, if the resulting row (specified by RowIndex) is line zero, then the previous row could be inspected - no need to probe addresses that are in the same row and would return the same result repeatedly.
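
The row-based fallback can be sketched against a simplified, already-decoded table. This does not use the real DWARFDebugLine API: `TableRow` and `lineWithFallback` are hypothetical, and stepping back over several consecutive line-0 rows (rather than exactly one) is an assumption made here for illustration.

```cpp
#include <algorithm>
#include <cstdint>
#include <vector>

struct TableRow {
  uint64_t Address;
  unsigned Line;
};

// Find the row covering Addr with a binary search, then, if that row is
// line 0, step back row by row (not address by address) to the nearest
// earlier row with a non-zero line. Rows must be sorted by Address.
unsigned lineWithFallback(const std::vector<TableRow> &Rows, uint64_t Addr) {
  auto It = std::upper_bound(
      Rows.begin(), Rows.end(), Addr,
      [](uint64_t A, const TableRow &R) { return A < R.Address; });
  if (It == Rows.begin())
    return 0; // Addr precedes every row
  --It;       // last row with Address <= Addr
  while (It->Line == 0 && It != Rows.begin())
    --It;
  return It->Line;
}
```

Each step back costs one row regardless of how many addresses that row spans, which is the efficiency argument made above against probing address by address.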

Revision Contents

Path	Size
llvm/docs/CommandGuide/llvm-symbolizer.rst	7 lines
llvm/include/llvm/DebugInfo/Symbolize/Symbolize.h	5 lines
llvm/lib/DebugInfo/Symbolize/Symbolize.cpp	23 lines
llvm/test/tools/llvm-symbolizer/approximate-missing-line-numbers-inline.s	207 lines
llvm/test/tools/llvm-symbolizer/approximate-missing-line-numbers.s	121 lines
llvm/tools/llvm-symbolizer/Opts.td	1 line
llvm/tools/llvm-symbolizer/llvm-symbolizer.cpp	1 line

Diff 411801

llvm/docs/CommandGuide/llvm-symbolizer.rst

[196 lines collapsed]

OPTIONS
-------

.. option:: --adjust-vma <offset>

  Add the specified offset to object file addresses when performing lookups.
  This can be used to perform lookups as if the object were relocated by the
  offset.

.. option:: --approximate-missing-line-numbers

  Attempt to find an approximate line number in cases with no line number, for
  example when the compiler has given no line number entry due to it
  being nebulous due to optimization. The line number of the previous
  address will instead be output.

Inline comment from jhenderson (Not Done), suggested edit:

  Attempt to find an approximate line number in cases with no line number, for
- example when the compiler has given no line number entry due to it
- being nebulous due to optimization. The line number of the previous
- address will instead be output.
+ example when the compiler cannot map an instruction to a line number due to
+ optimizations. The line number of the previous address will be output instead.

.. option:: --basenames, -s

  Print just the file's name without any directories, instead of the
  absolute path.

.. option:: --build-id

  Look up the object using the given build ID, specified as a hexadecimal

[277 lines collapsed]

llvm/include/llvm/DebugInfo/Symbolize/Symbolize.h

[52 lines collapsed; context: public:]

  struct Options {
    FunctionNameKind PrintFunctions = FunctionNameKind::LinkageName;
    FileLineInfoKind PathStyle = FileLineInfoKind::AbsoluteFilePath;
    bool UseSymbolTable = true;
    bool Demangle = true;
    bool RelativeAddresses = false;
    bool UntagAddresses = false;
    bool UseDIA = false;
+   bool ApproximateLineNumbers = false;
    std::string DefaultArch;
    std::vector<std::string> DsymHints;
    std::string FallbackDebugPath;
    std::string DWPName;
    std::vector<std::string> DebugFileDirectory;
    size_t MaxCacheSize = sizeof(size_t) == 4
                              ? 512 * 1024 * 1024 /* 512 MiB */
                              : 4ULL * 1024 * 1024 * 1024 /* 4 GiB */;

[66 lines collapsed; context: private:]

  template <typename T>
  Expected<DIGlobal> symbolizeDataCommon(const T &ModuleSpecifier,
                                         object::SectionedAddress ModuleOffset);
  template <typename T>
  Expected<std::vector<DILocal>>
  symbolizeFrameCommon(const T &ModuleSpecifier,
                       object::SectionedAddress ModuleOffset);

+ void LLVMSymbolizer::ApproximateMissingLineNumber(
+     SymbolizableModule *Info, object::SectionedAddress ModuleOffset,
+     DILineInfo *LineInfo);

Inline comment from jhenderson (Not Done): Nit: no blank lines at start of functions.

  /// Returns a SymbolizableModule or an error if loading debug info failed.
  /// Only one attempt is made to load a module, and errors during loading are
  /// only reported once. Subsequent calls to get module info for a module that
  /// failed to load will return nullptr.
  Expected<SymbolizableModule *>
  getOrCreateModuleInfo(const std::string &ModuleName);
  Expected<SymbolizableModule *> getOrCreateModuleInfo(const ObjectFile &Obj);

[99 lines collapsed]

llvm/lib/DebugInfo/Symbolize/Symbolize.cpp

[65 lines collapsed]

LLVMSymbolizer::symbolizeCodeCommon(const T &ModuleSpecifier,

if (Opts.RelativeAddresses)

ModuleOffset.Address += Info->getModulePreferredBase();

DILineInfo LineInfo = Info->symbolizeCode(

ModuleOffset, DILineInfoSpecifier(Opts.PathStyle, Opts.PrintFunctions),

Opts.UseSymbolTable);

if (Opts.Demangle)

LineInfo.FunctionName = DemangleName(LineInfo.FunctionName, Info);

if (Opts.ApproximateLineNumbers)

ApproximateMissingLineNumber(Info, ModuleOffset, &LineInfo);

return LineInfo;

}

Expected<DILineInfo>

LLVMSymbolizer::symbolizeCode(const ObjectFile &Obj,

object::SectionedAddress ModuleOffset) {

return symbolizeCodeCommon(Obj, ModuleOffset);

}

[32 lines collapsed]

Expected<DIInliningInfo> LLVMSymbolizer::symbolizeInlinedCodeCommon(

DIInliningInfo InlinedContext = Info->symbolizeInlinedCode(

ModuleOffset, DILineInfoSpecifier(Opts.PathStyle, Opts.PrintFunctions),

Opts.UseSymbolTable);

if (Opts.Demangle) {

for (int i = 0, n = InlinedContext.getNumberOfFrames(); i < n; i++) {

auto *Frame = InlinedContext.getMutableFrame(i);

Frame->FunctionName = DemangleName(Frame->FunctionName, Info);

}

if (Opts.ApproximateLineNumbers)

ApproximateMissingLineNumber(Info, ModuleOffset,

InlinedContext.getMutableFrame(0));

Inline comment from jhenderson (Not Done): This is unrelated to the demangling block above, so split it up.

return InlinedContext;

}

Expected<DIInliningInfo>

LLVMSymbolizer::symbolizeInlinedCode(const ObjectFile &Obj,

object::SectionedAddress ModuleOffset) {

return symbolizeInlinedCodeCommon(Obj, ModuleOffset);

}

Expected<DIInliningInfo>

LLVMSymbolizer::symbolizeInlinedCode(const std::string &ModuleName,

object::SectionedAddress ModuleOffset) {

return symbolizeInlinedCodeCommon(ModuleName, ModuleOffset);

}

Expected<DIInliningInfo>

LLVMSymbolizer::symbolizeInlinedCode(ArrayRef<uint8_t> BuildID,

object::SectionedAddress ModuleOffset) {

return symbolizeInlinedCodeCommon(BuildID, ModuleOffset);

}

// In cases in which no line is matched to an address, for example due to

// compiler optimization, look at the previous address.

void LLVMSymbolizer::ApproximateMissingLineNumber(

Inline comment from jhenderson (Not Done): Wrong function name style. Suggested edit:

- void LLVMSymbolizer::ApproximateMissingLineNumber(

+ void LLVMSymbolizer::approximateMissingLineNumber(

SymbolizableModule *Info, object::SectionedAddress ModuleOffset,

DILineInfo *LineInfo) {

if (LineInfo->Line != 0 || ModuleOffset.Address == 0)

return;

--ModuleOffset.Address;

DILineInfo ApproxLineInfo = Info->symbolizeCode(

ModuleOffset, DILineInfoSpecifier(Opts.PathStyle, Opts.PrintFunctions),

Opts.UseSymbolTable);

LineInfo->Line = ApproxLineInfo.Line;

Inline comment from jhenderson (Not Done): If I'm following this correctly, this ends up in a recursive loop until you find a non-zero line number or address zero? It sounds honestly rather inefficient to me, in the event of a compiler emitting a large amount of code with line 0. That being said, I don't know how often such a situation could occur. I would at least like to see a test case for multiple consecutive line 0 entries.

}

template <typename T>

Expected<DIGlobal>

LLVMSymbolizer::symbolizeDataCommon(const T &ModuleSpecifier,

object::SectionedAddress ModuleOffset) {

auto InfoOrErr = getOrCreateModuleInfo(ModuleSpecifier);

if (!InfoOrErr)

return InfoOrErr.takeError();

[625 lines collapsed]

llvm/test/tools/llvm-symbolizer/approximate-missing-line-numbers-inline.s

This file was added.

				# REQUIRES: x86-registered-target

				# RUN: llvm-mc -filetype=obj -triple=x86_64-pc-linux %s -o %t.o -g

				# RUN: llvm-symbolizer --obj=%t.o 0x0000000000000000 | FileCheck %s -DLINE=0
				# RUN: llvm-symbolizer --obj=%t.o 0x0000000000000000 --approximate-missing-line-numbers | FileCheck %s -DLINE=0
				# RUN: llvm-symbolizer --obj=%t.o 0x0000000000000006 | FileCheck %s -DLINE=4
				# RUN: llvm-symbolizer --obj=%t.o 0x0000000000000006 --approximate-missing-line-numbers | FileCheck %s -DLINE=4
				Inline comment from jhenderson (Not Done): I might be being dumb: is this test case supposed to be showing that --approximate-missing-line-numbers does nothing if the address is non-zero? That's fine (we should have that case), but if so, I a) don't see a non-inlined case where the option does anything (the address 16 case looks like it is the same essential case as this address 6 case).
				# RUN: llvm-symbolizer --obj=%t.o 0x0000000000000010 | FileCheck %s -DLINE1=0 -DLINE2=8 --check-prefix=INLINED
				# RUN: llvm-symbolizer --obj=%t.o 0x0000000000000010 --approximate-missing-line-numbers | FileCheck %s -DLINE1=4 -DLINE2=8 --check-prefix=INLINED
				# RUN: llvm-symbolizer --obj=%t.o 0x0000000000000016 | FileCheck %s -DLINE=8
				# RUN: llvm-symbolizer --obj=%t.o 0x0000000000000016 --approximate-missing-line-numbers | FileCheck %s -DLINE=8

				# CHECK: location:[[LINE]]
				# INLINED: location:[[LINE1]]
				# INLINED: location:[[LINE2]]

				## Built from the following source with
				## clang -target x86_64-pc-linux -O3 -g -S -gline-tables-only
				## and editing the marked .loc instructions
				## int foo = 0;
				##
				## int bar () {
				## return foo;
				## }
				##
				## int main() {
				## return bar();
				## }

				.text
				.file "test.c"
				.file 0 "location"
				.globl bar
				.p2align 4, 0x90
				.type bar,@function
				bar:
				.Lfunc_begin0:
				.loc 0 3 0
				.cfi_startproc
				.loc 0 0 11 prologue_end # Set line to 0
				movl foo(%rip), %eax
				.loc 0 4 4 is_stmt 0
				retq
				.Ltmp0:
				.Lfunc_end0:
				.size bar, .Lfunc_end0-bar
				.cfi_endproc
				.globl main
				.p2align 4, 0x90
				.type main,@function
				main:
				.Lfunc_begin1:
				.loc 0 7 0 is_stmt 1
				.cfi_startproc
				.loc 0 0 11 prologue_end # Set line to 0
				movl foo(%rip), %eax
				.Ltmp1:
				.loc 0 8 3
				retq
				.Ltmp2:
				.Lfunc_end1:
				.size main, .Lfunc_end1-main
				.cfi_endproc
				.type foo,@object
				.bss
				.globl foo
				.p2align 2
				foo:
				.long 0
				.size foo, 4

				.section .debug_abbrev,"",@progbits
				.byte 1 # Abbreviation Code
				.byte 17 # DW_TAG_compile_unit
				.byte 1 # DW_CHILDREN_yes
				.byte 37 # DW_AT_producer
				.byte 37 # DW_FORM_strx1
				.byte 19 # DW_AT_language
				.byte 5 # DW_FORM_data2
				.byte 3 # DW_AT_name
				.byte 37 # DW_FORM_strx1
				.byte 114 # DW_AT_str_offsets_base
				.byte 23 # DW_FORM_sec_offset
				.byte 16 # DW_AT_stmt_list
				.byte 23 # DW_FORM_sec_offset
				.byte 27 # DW_AT_comp_dir
				.byte 37 # DW_FORM_strx1
				.byte 17 # DW_AT_low_pc
				.byte 27 # DW_FORM_addrx
				.byte 18 # DW_AT_high_pc
				.byte 6 # DW_FORM_data4
				.byte 115 # DW_AT_addr_base
				.byte 23 # DW_FORM_sec_offset
				.byte 0 # EOM(1)
				.byte 0 # EOM(2)
				.byte 2 # Abbreviation Code
				.byte 46 # DW_TAG_subprogram
				.byte 0 # DW_CHILDREN_no
				.byte 3 # DW_AT_name
				.byte 37 # DW_FORM_strx1
				.byte 32 # DW_AT_inline
				.byte 33 # DW_FORM_implicit_const
				.byte 1
				.byte 0 # EOM(1)
				.byte 0 # EOM(2)
				.byte 3 # Abbreviation Code
				.byte 46 # DW_TAG_subprogram
				.byte 1 # DW_CHILDREN_yes
				.byte 17 # DW_AT_low_pc
				.byte 27 # DW_FORM_addrx
				.byte 18 # DW_AT_high_pc
				.byte 6 # DW_FORM_data4
				.byte 122 # DW_AT_call_all_calls
				.byte 25 # DW_FORM_flag_present
				.byte 3 # DW_AT_name
				.byte 37 # DW_FORM_strx1
				.byte 0 # EOM(1)
				.byte 0 # EOM(2)
				.byte 4 # Abbreviation Code
				.byte 29 # DW_TAG_inlined_subroutine
				.byte 0 # DW_CHILDREN_no
				.byte 49 # DW_AT_abstract_origin
				.byte 19 # DW_FORM_ref4
				.byte 17 # DW_AT_low_pc
				.byte 27 # DW_FORM_addrx
				.byte 18 # DW_AT_high_pc
				.byte 6 # DW_FORM_data4
				.byte 88 # DW_AT_call_file
				.byte 11 # DW_FORM_data1
				.byte 89 # DW_AT_call_line
				.byte 11 # DW_FORM_data1
				.byte 87 # DW_AT_call_column
				.byte 11 # DW_FORM_data1
				.byte 0 # EOM(1)
				.byte 0 # EOM(2)
				.byte 0 # EOM(3)
				.section .debug_info,"",@progbits
				.Lcu_begin0:
				.long .Ldebug_info_end0-.Ldebug_info_start0 # Length of Unit
				.Ldebug_info_start0:
				.short 5 # DWARF version number
				.byte 1 # DWARF Unit Type
				.byte 8 # Address Size (in bytes)
				.long .debug_abbrev # Offset Into Abbrev. Section
				.byte 1 # Abbrev [1] 0xc:0x2f DW_TAG_compile_unit
				.byte 0 # DW_AT_producer
				.short 12 # DW_AT_language
				.byte 1 # DW_AT_name
				.long .Lstr_offsets_base0 # DW_AT_str_offsets_base
				.long .Lline_table_start0 # DW_AT_stmt_list
				.byte 2 # DW_AT_comp_dir
				.byte 0 # DW_AT_low_pc
				.long .Lfunc_end1-.Lfunc_begin0 # DW_AT_high_pc
				.long .Laddr_table_base0 # DW_AT_addr_base
				.byte 2 # Abbrev [2] 0x23:0x2 DW_TAG_subprogram
				.byte 3 # DW_AT_name
				# DW_AT_inline
				.byte 3 # Abbrev [3] 0x25:0x15 DW_TAG_subprogram
				.byte 1 # DW_AT_low_pc
				.long .Lfunc_end1-.Lfunc_begin1 # DW_AT_high_pc
				# DW_AT_call_all_calls
				.byte 4 # DW_AT_name
				.byte 4 # Abbrev [4] 0x2c:0xd DW_TAG_inlined_subroutine
				.long 35 # DW_AT_abstract_origin
				.byte 1 # DW_AT_low_pc
				.long .Ltmp1-.Lfunc_begin1 # DW_AT_high_pc
				.byte 0 # DW_AT_call_file
				.byte 8 # DW_AT_call_line
				.byte 10 # DW_AT_call_column
				.byte 0 # End Of Children Mark
				.byte 0 # End Of Children Mark
				.Ldebug_info_end0:
				.section .debug_str_offsets,"",@progbits
				.long 24 # Length of String Offsets Set
				.short 5
				.short 0
				.Lstr_offsets_base0:
				.section .debug_str,"MS",@progbits,1
				.Linfo_string0:
				.asciz "clang version 15.0.0 (https://github.com/llvm/llvm-project.git 7dce12de68880fe7fb124afaf5bcf7671229cfc0)"
				.Linfo_string1:
				.asciz "temp.c"
				.Linfo_string2:
				.asciz "location"
				.Linfo_string3:
				.asciz "bar"
				.Linfo_string4:
				.asciz "main"
				.section .debug_str_offsets,"",@progbits
				.long .Linfo_string0
				.long .Linfo_string1
				.long .Linfo_string2
				.long .Linfo_string3
				.long .Linfo_string4
				.section .debug_addr,"",@progbits
				.long .Ldebug_addr_end0-.Ldebug_addr_start0 # Length of contribution
				.Ldebug_addr_start0:
				.short 5 # DWARF version number
				.byte 8 # Address size
				.byte 0 # Segment selector size
				.Laddr_table_base0:
				.quad .Lfunc_begin0
				.quad .Lfunc_begin1
				.Ldebug_addr_end0:
				.section .debug_line,"",@progbits
				.Lline_table_start0:

llvm/test/tools/llvm-symbolizer/approximate-missing-line-numbers.s

This file was added.

				# REQUIRES: x86-registered-target
				Inline comment from jhenderson (Not Done): Is there any real difference between inlined and non-inline cases? The code change is under the `symbolizeCodeCommon` function after all.

				# RUN: llvm-mc -filetype=obj -triple=x86_64-pc-linux %s -o %t.o -g

				# RUN: llvm-symbolizer --obj=%t.o 0x0000000000000006 | FileCheck %s -DLINE=0
				# RUN: llvm-symbolizer --approximate-missing-line-numbers --obj=%t.o 0x0000000000000006 | FileCheck %s -DLINE=4

				# RUN: llvm-symbolizer --obj=%t.o 0x0000000000000006 --output-style=GNU --no-inlines | FileCheck %s -DLINE=0
				# RUN: llvm-symbolizer --approximate-missing-line-numbers --obj=%t.o 0x0000000000000006 --output-style=GNU --no-inlines | FileCheck %s -DLINE=4

				# RUN: llvm-symbolizer --obj=%t.o 0x0000000000000006 --output-style=JSON --no-inlines | FileCheck %s --check-prefix=JSON -DLINE=0
				# RUN: llvm-symbolizer --approximate-missing-line-numbers --obj=%t.o 0x0000000000000006 --output-style=JSON --no-inlines | FileCheck %s --check-prefix=JSON -DLINE=4
				Inline comment from jhenderson (Not Done): I'm not sure output style is at all relevant to this option?

				# CHECK: location:[[LINE]]
				# JSON: "Line":[[LINE]]

				## Built from the following source with
				## clang -target x86_64-pc-linux -O3 -g -S
				## and editing the marked .loc instructions
				## int foo = 0;
				##
				## int main() {
				## return foo;
				## }

				.text
				.file 0 "location"
				.globl main
				.p2align 4, 0x90
				.type main,@function
				main:
				.Lfunc_begin0:
				.loc 0 3 0
				.cfi_startproc
				.loc 0 4 10 prologue_end
				movl foo(%rip), %eax
				.loc 0 0 3 is_stmt 0 # Set line to 0
				retq
				.Ltmp0:
				.Lfunc_end0:
				.size main, .Lfunc_end0-main
				.cfi_endproc
				.type foo,@object
				.bss
				.globl foo
				.p2align 2
				foo:
				.long 0
				.size foo, 4

				.section .debug_abbrev,"",@progbits
				.byte 1 # Abbreviation Code
				.byte 17 # DW_TAG_compile_unit
				.byte 0 # DW_CHILDREN_no
				.byte 37 # DW_AT_producer
				.byte 37 # DW_FORM_strx1
				.byte 19 # DW_AT_language
				.byte 5 # DW_FORM_data2
				.byte 3 # DW_AT_name
				.byte 37 # DW_FORM_strx1
				.byte 114 # DW_AT_str_offsets_base
				.byte 23 # DW_FORM_sec_offset
				.byte 16 # DW_AT_stmt_list
				.byte 23 # DW_FORM_sec_offset
				.byte 27 # DW_AT_comp_dir
				.byte 37 # DW_FORM_strx1
				.byte 17 # DW_AT_low_pc
				.byte 27 # DW_FORM_addrx
				.byte 18 # DW_AT_high_pc
				.byte 6 # DW_FORM_data4
				.byte 115 # DW_AT_addr_base
				.byte 23 # DW_FORM_sec_offset
				.byte 0 # EOM(1)
				.byte 0 # EOM(2)
				.byte 0 # EOM(3)
				.section .debug_info,"",@progbits
				.Lcu_begin0:
				.long .Ldebug_info_end0-.Ldebug_info_start0 # Length of Unit
				.Ldebug_info_start0:
				.short 5 # DWARF version number
				.byte 1 # DWARF Unit Type
				.byte 8 # Address Size (in bytes)
				.long .debug_abbrev # Offset Into Abbrev. Section
				.byte 1 # Abbrev [1] 0xc:0x17 DW_TAG_compile_unit
				.byte 0 # DW_AT_producer
				.short 12 # DW_AT_language
				.byte 1 # DW_AT_name
				.long .Lstr_offsets_base0 # DW_AT_str_offsets_base
				.long .Lline_table_start0 # DW_AT_stmt_list
				.byte 2 # DW_AT_comp_dir
				.byte 0 # DW_AT_low_pc
				.long .Lfunc_end0-.Lfunc_begin0 # DW_AT_high_pc
				.long .Laddr_table_base0 # DW_AT_addr_base
				.Ldebug_info_end0:
				.section .debug_str_offsets,"",@progbits
				.long 16 # Length of String Offsets Set
				.short 5
				.short 0
				.Lstr_offsets_base0:
				.section .debug_str,"MS",@progbits,1
				.Linfo_string0:
				.asciz "clang version 15.0.0 (https://github.com/llvm/llvm-project.git 7dce12de68880fe7fb124afaf5bcf7671229cfc0)"
				.Linfo_string1:
				.asciz "temp.c"
				.Linfo_string2:
				.asciz "location"
				.section .debug_str_offsets,"",@progbits
				.long .Linfo_string0
				.long .Linfo_string1
				.long .Linfo_string2
				.section .debug_addr,"",@progbits
				.long .Ldebug_addr_end0-.Ldebug_addr_start0 # Length of contribution
				.Ldebug_addr_start0:
				.short 5 # DWARF version number
				.byte 8 # Address size
				.byte 0 # Segment selector size
				.Laddr_table_base0:
				.quad .Lfunc_begin0
				.Ldebug_addr_end0:
				.section .debug_line,"",@progbits
				.Lline_table_start0:

llvm/tools/llvm-symbolizer/Opts.td

[14 lines collapsed]

  def grp_mach_o : OptionGroup<"kind">,
                   HelpText<"llvm-symbolizer Mach-O Specific Options">;

  def addresses : F<"addresses", "Show address before line information">;
  defm adjust_vma
      : Eq<"adjust-vma", "Add specified offset to object file addresses">,
        MetaVarName<"<offset>">;
+ def approximate_missing_line_numbers : F<"approximate-missing-line-numbers", "Find an approximate line number in cases with no line number">;
  def basenames : Flag<["--"], "basenames">, HelpText<"Strip directory names from paths">;
  defm build_id : Eq<"build-id", "Build ID used to look up the object file">;
  defm cache_size : Eq<"cache-size", "Max size in bytes of the in-memory binary cache.">;
  defm debug_file_directory : Eq<"debug-file-directory", "Path to directory where to look for debug files">, MetaVarName<"<dir>">;
  defm debuginfod : B<"debuginfod", "Use debuginfod to find debug binaries", "Don't use debuginfod to find debug binaries">;
  defm default_arch
      : Eq<"default-arch", "Default architecture (for multi-arch objects)">,
        Group<grp_mach_o>;

[53 lines collapsed]

llvm/tools/llvm-symbolizer/llvm-symbolizer.cpp

[362 lines collapsed; context: int main(int argc, char **argv) {]

  StringSaver Saver(A);
  SymbolizerOptTable Tbl;
  opt::InputArgList Args = parseOptions(argc, argv, IsAddr2Line, Saver, Tbl);

  LLVMSymbolizer::Options Opts;
  uint64_t AdjustVMA;
  PrinterConfig Config;
  parseIntArg(Args, OPT_adjust_vma_EQ, AdjustVMA);
+ Opts.ApproximateLineNumbers = Args.hasArg(OPT_approximate_missing_line_numbers);

Lint (Pre-merge checks): clang-format: please reformat the code

- Opts.ApproximateLineNumbers = Args.hasArg(OPT_approximate_missing_line_numbers);
+ Opts.ApproximateLineNumbers =
+     Args.hasArg(OPT_approximate_missing_line_numbers);

Inline comment from jhenderson (Not Done): Nit: clang-format.

  if (const opt::Arg *A = Args.getLastArg(OPT_basenames, OPT_relativenames)) {
    Opts.PathStyle =
        A->getOption().matches(OPT_basenames)
            ? DILineInfoSpecifier::FileLineInfoKind::BaseNameOnly
            : DILineInfoSpecifier::FileLineInfoKind::RelativeFilePath;
  } else {
    Opts.PathStyle = DILineInfoSpecifier::FileLineInfoKind::AbsoluteFilePath;
  }

[96 lines collapsed]
