
Details

Reviewers

ruiu
espindola

Commits

rGd621037788c3: [ELF] Rework debug line parsing to use llvm::Error and callbacks (LLD-side)
rLLD331972: [ELF] Rework debug line parsing to use llvm::Error and callbacks (LLD-side)
rL331972: [ELF] Rework debug line parsing to use llvm::Error and callbacks (LLD-side)

Summary

Feedback on D44382 suggested changing the interface to use a callback instead of the structures it used. As I didn't want to effectively overwrite that diff, I created D44560 as an alternative implementation using the suggested approach. This is the corresponding patch required for LLD, assuming D44560 is applied.

As with D44382, D44560 changes the debug line parser interface to report LLVM errors through an interface that different executables can use, rather than always printing them directly as warnings to stderr. This change allows LLD to make use of the new interface and call its own warning methods to report problems.
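The idea of handing problems to a caller-supplied callback can be sketched in isolation. The following is a minimal, self-contained illustration only: `ToyLineTable`, `parseToyLineTable`, `ProblemCallback` and `ToyLinker` are hypothetical names invented for this sketch and are not part of LLVM or LLD.

```cpp
#include <cassert>
#include <cstdint>
#include <functional>
#include <string>
#include <vector>

// Hypothetical stand-in for a library parser: it never prints anything itself,
// it only describes problems to whatever callback the caller supplies.
using ProblemCallback = std::function<void(const std::string &)>;

struct ToyLineTable {
  uint16_t Version = 0;
};

// Parses a two-byte little-endian version field. Returns false when no table
// could be produced; recoverable oddities are reported through the callback.
bool parseToyLineTable(const std::vector<uint8_t> &Data, ToyLineTable &Out,
                       const ProblemCallback &Report) {
  assert(Report && "caller must supply a callback");
  if (Data.size() < 2) {
    Report("section too small to contain a version field");
    return false;
  }
  Out.Version = static_cast<uint16_t>(Data[0] | (Data[1] << 8));
  if (Data.size() > 2)
    Report("unexpected trailing bytes after version field");
  return true;
}

// One possible client: a linker-like tool that records warnings in its own
// format instead of letting the parser write to stderr.
struct ToyLinker {
  std::vector<std::string> Warnings;
  void warn(const std::string &Msg) { Warnings.push_back("warning: " + Msg); }
};
```

A different tool could pass a callback that ignores the text, counts problems, or escalates them, without the parser changing.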

To test this, I have extended the bad-debug undefined symbol message case to show that a corresponding warning is printed if the debug line cannot be parsed. I have also added tests for LLD attempting to parse a non-existent or empty debug line section, showing that the new warning for attempting this is not emitted.

Diff Detail

Repository: rL LLVM

Event Timeline

jhenderson created this revision. Mar 16 2018, 6:54 AM

jhenderson mentioned this in D44560: [DWARF] Rework debug line parsing to use llvm::Error and callbacks.

jhenderson mentioned this in D44388: [ELF] Rework debug line parsing to use llvm::Error (LLD-side). Mar 19 2018, 2:26 AM

grimar added inline comments. Mar 19 2018, 2:46 AM

ELF/InputFiles.cpp
128 ↗	(On Diff #138691)	I would move this comment above `isValidOffset(0)` and update it as you are doing something with offset 0 right there already.
136 ↗	(On Diff #138691)	Is it useful to `warn` here? I think we only call this method on a linkage error, so we are going to terminate the link anyway. Given that, it seems easier to always error out here to simplify the code and logic.

jhenderson added inline comments. Mar 19 2018, 3:28 AM

ELF/InputFiles.cpp
128 ↗	(On Diff #138691)	Actually, the comment (and code) is technically wrong - it is possible for there to be multiple CUs in a single object input. I'm thinking objects built via LTO or -r links. I'll add a FIXME and file a bug, along with updating the comment.
136 ↗	(On Diff #138691)	I'm currently preserving behaviour (or fixing intended behaviour, at least), though I certainly see where you're coming from. In general we try to emit as many errors/warnings as we can before stopping, so I still think it's useful - it might help the user identify some other problem, for example. I'm not bothered either way though, so if the consensus is to not emit the warning, I can easily change that.

grimar added inline comments. Mar 19 2018, 3:33 AM

ELF/InputFiles.cpp
136 ↗	(On Diff #138691)	I meant that I think we can always `error` here instead of `warn`, in all cases.

grimar added inline comments. Mar 19 2018, 3:34 AM

ELF/InputFiles.cpp
136 ↗	(On Diff #138691)	As at this point, we are already in error state.

Update comment.

ELF/InputFiles.cpp
136 ↗	(On Diff #138691)	Right, but that would suggest to the user that fixing the malformed debug line is a requirement to get a working link, which it isn't. Perhaps another way to think about it is if we wanted to introduce a new warning that could use source information. We'd want to use this function too, but we shouldn't stop the link succeeding if we can't parse the line table. I could actually make a case for making the second one a warning too, but I made that an error, because we don't actually ever expect it to be reported currently.

JDevlieghere added a subscriber: JDevlieghere. Mar 19 2018, 8:21 AM

espindola added inline comments. Mar 19 2018, 2:24 PM

test/ELF/undef.s
37 ↗	(On Diff #138892)	I think I agree with George about producing an error on corrupted output. In practice we expect to never see it, so we may as well make the code simpler.

espindola added inline comments. Mar 19 2018, 2:25 PM

ELF/InputFiles.cpp
130 ↗	(On Diff #138892)	When do we expect to have LineData that doesn't include the current CU?

David Blaikie via llvm-commits <llvm-commits@lists.llvm.org> writes:

@rafael on the mailing list:

Now that I think of it, the fact that lld could not parse the debug info to provide a better error message should always be a note, not an error, regardless of how broken the section happens to be.

To be clear, are you saying you're happy with the warnings for now, or would you prefer to drop the severity to just use "message()"? Also, what do you think about the generic error type which currently emits an error? I'd be happy to change the severity of that one too. It's currently an error because we don't expect an Error of that type ever to be reported.

ELF/InputFiles.cpp
130 ↗	(On Diff #138892)	LineData contains the contents of the .debug_line section. It is possible to have an empty section, or a missing section, in which case there will be no valid contents in LineData. Previously, when LLD requested parsing of this section, the parser would return false immediately, because the offset was invalid. With the changes in D44560, an Error is now returned saying the offset is invalid. In addition, since an object file can contain multiple CUs (e.g. via LTO or -r links), some of which might be missing debug data and others not, we can't know for certain using this method that offset 0 is the offset of the CU for the corresponding symbol (see the reproducible in PR36793).
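The guard described in this comment (skip parsing when offset 0 is not valid, so that an empty or missing section produces no diagnostic) can be sketched as follows. This is a simplified illustration under stated assumptions: `ToySection`, `parseUnitAt` and `warningsForSection` are hypothetical names, and the error strings are invented for the sketch rather than taken from the real parser.

```cpp
#include <cassert>
#include <cstdint>
#include <string>
#include <vector>

// Hypothetical miniature of a data extractor over one section's bytes.
struct ToySection {
  std::vector<uint8_t> Bytes;
  // An offset is valid only if it addresses an existing byte, so an empty or
  // missing section has no valid offsets at all, not even 0.
  bool isValidOffset(uint64_t Offset) const { return Offset < Bytes.size(); }
};

// Hypothetical parser: returns an empty string on success, otherwise a
// description of the problem (standing in for a returned error object).
std::string parseUnitAt(const ToySection &S, uint64_t Offset) {
  if (!S.isValidOffset(Offset))
    return "offset is not a valid debug line section offset";
  if (S.Bytes.size() - Offset < 4)
    return "unit length field is truncated";
  return "";
}

// Caller-side guard: do not ask the parser to look at offset 0 when there is
// nothing there, so that the absence of debug data stays silent.
std::vector<std::string> warningsForSection(const ToySection &S) {
  std::vector<std::string> Warnings;
  if (!S.isValidOffset(0))
    return Warnings;
  std::string Problem = parseUnitAt(S, 0);
  if (!Problem.empty())
    Warnings.push_back(Problem);
  return Warnings;
}
```

Without the guard, the empty-section case would surface the parser's invalid-offset error to the user even though nothing is actually wrong with the input.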

In D44562#1042990, @jhenderson wrote:

@rafael on the mailing list:

Now that I think of it, the fact that lld could not parse the debug info to provide a better error message should always be a note, not an error, regardless of how broken the section happens to be.

To be clear, are you saying you're happy with the warnings for now, or would you prefer to drop the severity to just use "message()"? Also, what do you think about the generic error type which currently emits an error? I'd be happy to change the severity of that one too. It's currently an error because we don't expect an Error of that type ever to be reported.

Using message() in all cases seems better. We know the link is failing for unrelated reasons and we are just letting the user know why we are not printing better error messages (could not parse the debug info).

In D44562#1043904, @espindola wrote:

Using message() in all cases seems better. We know the link is failing for unrelated reasons and we are just letting the user know why we are not printing better error messages (could not parse the debug info).

message() prints to stdout, which I hadn't realised earlier. This clearly should be printed to stderr, for which there is no mechanism currently, if we don't want a warning or error. Thinking about it more, the rest of LLVM has a concept of "remarks" which are "lower" severity messages than errors or warnings (and can't be promoted to errors, as I understand it). How about I add this to LLD? Alternatively, I'll need to add a note() function, or just print to errs() directly.
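The severity question above comes down to which stream a diagnostic goes to and whether it fails the link. A rough, self-contained sketch of that split is shown below; `ToyDiagnostics` is a hypothetical class written for this illustration and is not LLD's ErrorHandler, whose real interface has more to it.

```cpp
#include <cassert>
#include <cstdint>
#include <ostream>
#include <sstream>
#include <string>

// Hypothetical diagnostics sink with the three levels discussed in the review.
// The streams are injected so the routing can be observed in isolation.
class ToyDiagnostics {
public:
  ToyDiagnostics(std::ostream &Out, std::ostream &Err) : Out(Out), Err(Err) {}

  // Informational output goes to the "stdout" stream.
  void message(const std::string &Msg) { Out << Msg << '\n'; }

  // Warnings go to the "stderr" stream but do not fail the link.
  void warn(const std::string &Msg) { Err << "warning: " << Msg << '\n'; }

  // Errors go to the "stderr" stream and mark the link as failed.
  void error(const std::string &Msg) {
    Err << "error: " << Msg << '\n';
    ++ErrorCount;
  }

  bool linkFailed() const { return ErrorCount != 0; }

private:
  std::ostream &Out;
  std::ostream &Err;
  uint64_t ErrorCount = 0;
};
```

In this model, a line-table problem reported with `warn` appears next to the undefined-symbol error on the error stream, yet only the undefined symbol itself causes the failure.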

Add a "remark" method to the LLD ErrorHandler, and use that instead of warnings to report problems with the debug line. Also added another case to the testing, to demonstrate failures via the callback in contrast to failures via the return value.

Tentative LGTM pending the llvm change.

This revision is now accepted and ready to land. Mar 26 2018, 6:52 PM

Thanks @espindola. I'll still need to rebase this once I've figured out what to do following rLLD328284 (see also my comments in D44560). I don't anticipate this changing from using remark() though. It's just how that function gets called.

Rebase following rLLD328284. I was able to drop one of the test inputs, now that LLD can handle multiple CUs in the same input file. Also added a comment to the test explaining the purpose of the zed6/zed7 test cases.

This revision is now accepted and ready to land. Mar 28 2018, 7:01 AM

@espindola/@ruiu, are you happy with the latest update?

ruiu added inline comments. Mar 28 2018, 1:54 PM

Common/ErrorHandler.cpp
86–91 ↗	(On Diff #140069)	Honestly I don't think we want to add a new level of error messages, as I think error() for errors, warning() for warnings and message() for non-error messages is enough. If something needs to be fixed, it should be an error or a warning. If something doesn't have to be fixed, lld shouldn't print out anything. A situation like "verbose messages are printed out but you can ignore them" isn't actionable and thus not desirable. So, can you choose either (1) show it as a warning or (2) don't show anything?

jhenderson added inline comments. Mar 29 2018, 1:42 AM

Common/ErrorHandler.cpp
86–91 ↗	(On Diff #140069)	Okay, I have no problem making it a warning. I think this might be a safer course than not printing anything. If undefined symbol messages are incomplete, due to a problem in the input, it might be possible for users to fix it, depending on the nature of the problem (e.g. bogus hand-written debug_line assembly), and therefore allowing them to see the useful error information, allowing them to more quickly identify the problem. I could also see us potentially using this function in the future to produce other informational messages, or implement other features, without always emitting an error, so a warning would be appropriate here too.

Change back to using warnings.

LGTM. I will review the LLVM side too.

This revision is now accepted and ready to land. Mar 29 2018, 3:38 PM

jhenderson edited the summary of this revision. May 10 2018, 3:52 AM

jhenderson removed a subscriber: rafael.

Closed by commit rL331972: [ELF] Rework debug line parsing to use llvm::Error and callbacks (LLD-side) (authored by jhenderson). May 10 2018, 3:56 AM

This revision was automatically updated to reflect the committed changes.

Diff 146109

lld/trunk/ELF/InputFiles.cpp

	template <class ELFT> void ObjFile<ELFT>::initializeDwarf() {			template <class ELFT> void ObjFile<ELFT>::initializeDwarf() {
	Dwarf = llvm::make_unique<DWARFContext>(make_unique<LLDDwarfObj<ELFT>>(this));			Dwarf = llvm::make_unique<DWARFContext>(make_unique<LLDDwarfObj<ELFT>>(this));
	const DWARFObject &Obj = Dwarf->getDWARFObj();			const DWARFObject &Obj = Dwarf->getDWARFObj();
	DwarfLine.reset(new DWARFDebugLine);			DwarfLine.reset(new DWARFDebugLine);
	DWARFDataExtractor LineData(Obj, Obj.getLineSection(), Config->IsLE,			DWARFDataExtractor LineData(Obj, Obj.getLineSection(), Config->IsLE,
	Config->Wordsize);			Config->Wordsize);

	for (std::unique_ptr<DWARFCompileUnit> &CU : Dwarf->compile_units()) {			for (std::unique_ptr<DWARFCompileUnit> &CU : Dwarf->compile_units()) {
	const DWARFDebugLine::LineTable *LT = Dwarf->getLineTableForUnit(CU.get());			Expected<const DWARFDebugLine::LineTable *> ExpectedLT =
				Dwarf->getLineTableForUnit(CU.get(), warn);
				const DWARFDebugLine::LineTable *LT = nullptr;
				if (ExpectedLT)
				LT = *ExpectedLT;
				else
				handleAllErrors(ExpectedLT.takeError(),
				[](ErrorInfoBase &Err) { warn(Err.message()); });
	if (!LT)			if (!LT)
	continue;			continue;
	LineTables.push_back(LT);			LineTables.push_back(LT);

	// Loop over variable records and insert them to VariableLoc.			// Loop over variable records and insert them to VariableLoc.
	for (const auto &Entry : CU->dies()) {			for (const auto &Entry : CU->dies()) {
	DWARFDie Die(CU.get(), &Entry);			DWARFDie Die(CU.get(), &Entry);
	// Skip all tags that are not variables.			// Skip all tags that are not variables.
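The control flow in the hunk above (take the table on success, otherwise report the error and skip that unit) can be mimicked without LLVM. The sketch below uses a hypothetical `ToyExpected` type as a much-simplified stand-in for a value-or-error return; `ToyTable`, `ToyUnit`, `getTableForUnit` and `collectTables` are likewise invented for illustration and do not reproduce the real API.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical, much-simplified value-or-error holder.
template <typename T> class ToyExpected {
public:
  static ToyExpected success(T V) {
    ToyExpected E;
    E.HasValue = true;
    E.Value = V;
    return E;
  }
  static ToyExpected failure(const std::string &Msg) {
    ToyExpected E;
    E.Error = Msg;
    return E;
  }
  explicit operator bool() const { return HasValue; }
  T operator*() const {
    assert(HasValue && "dereferencing a failed result");
    return Value;
  }
  std::string takeError() const { return Error; }

private:
  ToyExpected() = default;
  bool HasValue = false;
  T Value{};
  std::string Error;
};

struct ToyTable {
  int FirstLine;
};

struct ToyUnit {
  bool Corrupt;
  ToyTable Table;
};

ToyExpected<const ToyTable *> getTableForUnit(const ToyUnit &U) {
  if (U.Corrupt)
    return ToyExpected<const ToyTable *>::failure("line table is corrupt");
  return ToyExpected<const ToyTable *>::success(&U.Table);
}

// Mirrors the loop shape in the diff: a failing unit produces a warning and is
// skipped, while the remaining units are still collected.
std::vector<const ToyTable *>
collectTables(const std::vector<ToyUnit> &Units,
              std::vector<std::string> &Warnings) {
  std::vector<const ToyTable *> Tables;
  for (const ToyUnit &U : Units) {
    ToyExpected<const ToyTable *> ExpectedLT = getTableForUnit(U);
    const ToyTable *LT = nullptr;
    if (ExpectedLT)
      LT = *ExpectedLT;
    else
      Warnings.push_back(ExpectedLT.takeError());
    if (!LT)
      continue;
    Tables.push_back(LT);
  }
  return Tables;
}
```

Skipping rather than aborting is what lets one unit's broken line table leave source locations for the other units in the same object intact.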

lld/trunk/test/ELF/Inputs/undef-bad-debug.s

	.section .text,"ax"			.section .text,"ax"
	sym:			sym:
	.quad zed6			.quad zed6
				sym2:
				.quad zed7

				.section .debug_line,"",@progbits
				.Lunit:
				.long .Lunit_end - .Lunit_start # unit length
				.Lunit_start:
				.short 4 # version
				.long .Lprologue_end - .Lprologue_start # prologue length
				.Lprologue_start:
				.byte 1 # minimum instruction length
				.byte 1 # maximum operatiosn per instruction
				.byte 1 # default is_stmt
				.byte -5 # line base
				.byte 14 # line range
				.byte 13 # opcode base
				.byte 0, 1, 1, 1, 1, 0, 0, 0, 1, 0, 0, 1 # standard opcode lengths
				.asciz "dir" # include directories
				.byte 0
				.asciz "undef-bad-debug.s" # file names
				.byte 1, 0, 0
				.byte 0
				.byte 0 # extraneous byte
				.Lprologue_end:
				.byte 0, 9, 2 # DW_LNE_set_address
				.quad sym
				.byte 3 # DW_LNS_advance_line
				.byte 10
				.byte 1 # DW_LNS_copy
				.byte 2 # DW_LNS_advance_pc
				.byte 8
				.byte 0, 1, 1 # DW_LNE_end_sequence
				.Lunit_end:

				.Lunit2:
				.long .Lunit2_end - .Lunit2_start # unit length
				.Lunit2_start:
				.short 4 # version
				.long .Lprologue2_end - .Lprologue2_start # prologue length
				.Lprologue2_start:
				.byte 1 # minimum instruction length
				.byte 1 # maximum operatiosn per instruction
				.byte 1 # default is_stmt
				.byte -5 # line base
				.byte 14 # line range
				.byte 13 # opcode base
				.byte 0, 1, 1, 1, 1, 0, 0, 0, 1, 0, 0, 1 # standard opcode lengths
				.asciz "dir2" # include directories
				.byte 0
				.asciz "undef-bad-debug2.s" # file names
				.byte 1, 0, 0
				.byte 0
				.Lprologue2_end:
				.byte 0, 9, 2 # DW_LNE_set_address
				.quad sym2
				.byte 3 # DW_LNS_advance_line
				.byte 10
				.byte 1 # DW_LNS_copy
				.byte 2 # DW_LNS_advance_pc
				.byte 8
				.byte 0, 1, 1 # DW_LNE_end_sequence
				.byte 0, 9, 2 # DW_LNE_set_address
				.quad 0x0badbeef
				.byte 3 # DW_LNS_advance_line
				.byte 99
				.byte 1 # DW_LNS_copy
				.byte 99 # DW_LNS_advance_pc
				.byte 119
				# Missing end of sequence.
				.Lunit2_end:

	.section .debug_info,"",@progbits			.section .debug_info,"",@progbits
	.long .Lcu_end - .Lcu_start # Length of Unit			.long .Lcu_end - .Lcu_start # Length of Unit
	.Lcu_start:			.Lcu_start:
	.short 4 # DWARF version number			.short 4 # DWARF version number
	.long .Lsection_abbrev # Offset Into Abbrev. Section			.long .Lsection_abbrev # Offset Into Abbrev. Section
	.byte 8 # Address Size (in bytes)			.byte 8 # Address Size (in bytes)
	.byte 1 # Abbrev [1] 0xb:0x79 DW_TAG_compile_unit			.byte 1 # Abbrev [1] 0xb:0x79 DW_TAG_compile_unit
				.long .Lunit # DW_AT_stmt_list
	.byte 2 # Abbrev [2] 0x2a:0x15 DW_TAG_variable			.byte 2 # Abbrev [2] 0x2a:0x15 DW_TAG_variable
	.long .Linfo_string # DW_AT_name			.long .Linfo_string # DW_AT_name
	# DW_AT_external			# DW_AT_external
	.byte 1 # DW_AT_decl_file			.byte 1 # DW_AT_decl_file
	.byte 3 # DW_AT_decl_line			.byte 3 # DW_AT_decl_line
	.byte 0 # End Of Children Mark			.byte 0 # End Of Children Mark
	.Lcu_end:			.Lcu_end:

				.long .Lcu2_end - .Lcu2_start # Length of Unit
				.Lcu2_start:
				.short 4 # DWARF version number
				.long .Lsection_abbrev # Offset Into Abbrev. Section
				.byte 8 # Address Size (in bytes)
				.byte 1 # Abbrev [1] 0xb:0x79 DW_TAG_compile_unit
				.long .Lunit2 # DW_AT_stmt_list
				.byte 2 # Abbrev [2] 0x2a:0x15 DW_TAG_variable
				.long .Linfo2_string # DW_AT_name
				# DW_AT_external
				.byte 1 # DW_AT_decl_file
				.byte 3 # DW_AT_decl_line
				.byte 0 # End Of Children Mark
				.Lcu2_end:

	.section .debug_abbrev,"",@progbits			.section .debug_abbrev,"",@progbits
	.Lsection_abbrev:			.Lsection_abbrev:
	.byte 1 # Abbreviation Code			.byte 1 # Abbreviation Code
	.byte 17 # DW_TAG_compile_unit			.byte 17 # DW_TAG_compile_unit
	.byte 1 # DW_CHILDREN_yes			.byte 1 # DW_CHILDREN_yes
				.byte 16 # DW_AT_stmt_list
				.byte 23 # DW_FORM_sec_offset
	.byte 0 # EOM(1)			.byte 0 # EOM(1)
	.byte 0 # EOM(2)			.byte 0 # EOM(2)
	.byte 2 # Abbreviation Code			.byte 2 # Abbreviation Code
	.byte 52 # DW_TAG_variable			.byte 52 # DW_TAG_variable
	.byte 0 # DW_CHILDREN_no			.byte 0 # DW_CHILDREN_no
	.byte 3 # DW_AT_name			.byte 3 # DW_AT_name
	.byte 14 # DW_FORM_strp			.byte 14 # DW_FORM_strp
	.byte 63 # DW_AT_external			.byte 63 # DW_AT_external
	.byte 25 # DW_FORM_flag_present			.byte 25 # DW_FORM_flag_present
	.byte 58 # DW_AT_decl_file			.byte 58 # DW_AT_decl_file
	.byte 11 # DW_FORM_data1			.byte 11 # DW_FORM_data1
	.byte 59 # DW_AT_decl_line			.byte 59 # DW_AT_decl_line
	.byte 11 # DW_FORM_data1			.byte 11 # DW_FORM_data1
	.byte 0 # EOM(1)			.byte 0 # EOM(1)
	.byte 0 # EOM(2)			.byte 0 # EOM(2)
	.byte 0 # EOM(3)			.byte 0 # EOM(3)

	.section .debug_str,"MS",@progbits,1			.section .debug_str,"MS",@progbits,1
	.Linfo_string:			.Linfo_string:
	.asciz "sym"			.asciz "sym"
				.Linfo2_string:
				.asciz "sym2"

lld/trunk/test/ELF/no-line-parser-errors-if-empty-section.s

				# REQUIRES: x86

				# LLD uses the debug data to get information for error messages, if possible.
				# However, if the debug line section is empty, we should not attempt to parse
				# it, as that would result in errors from the parser.

				# RUN: llvm-mc -filetype=obj -triple=x86_64-pc-linux %s -o %t.o
				# RUN: not ld.lld %t.o -o %t.elf 2>&1 | FileCheck %s

				# CHECK-NOT: warning:
				# CHECK-NOT: error:
				# CHECK: error: undefined symbol: undefined
				# CHECK-NEXT: {{.*}}.o:(.text+0x1)
				# CHECK-NOT: warning:
				# CHECK-NOT: error:

				.globl _start
				_start:
				callq undefined

				.section .debug_line,"",@progbits

lld/trunk/test/ELF/no-line-parser-errors-if-no-section.s

				# REQUIRES: x86

				# LLD uses the debug data to get information for error messages, if possible.
				# However, if there is no debug line section, we should not attempt to parse
				# it, as that would result in errors from the parser.

				# RUN: llvm-mc -filetype=obj -triple=x86_64-pc-linux %s -o %t.o
				# RUN: not ld.lld %t.o -o %t.elf 2>&1 | FileCheck %s

				# CHECK-NOT: warning:
				# CHECK-NOT: error:
				# CHECK: error: undefined symbol: undefined
				# CHECK-NEXT: {{.*}}.o:(.text+0x1)
				# CHECK-NOT: warning:
				# CHECK-NOT: error:

				.globl _start
				_start:
				callq undefined

lld/trunk/test/ELF/undef.s

	# CHECK: error: undefined symbol: zed4			# CHECK: error: undefined symbol: zed4
	# CHECK: >>> referenced by undef-debug.s:7 (dir{{/|\\}}undef-debug.s:7)			# CHECK: >>> referenced by undef-debug.s:7 (dir{{/|\\}}undef-debug.s:7)
	# CHECK: >>> {{.*}}.o:(.text.1+0x0)			# CHECK: >>> {{.*}}.o:(.text.1+0x0)

	# CHECK: error: undefined symbol: zed5			# CHECK: error: undefined symbol: zed5
	# CHECK: >>> referenced by undef-debug.s:11 (dir{{/|\\}}undef-debug.s:11)			# CHECK: >>> referenced by undef-debug.s:11 (dir{{/|\\}}undef-debug.s:11)
	# CHECK: >>> {{.*}}.o:(.text.2+0x0)			# CHECK: >>> {{.*}}.o:(.text.2+0x0)

				# Show that all line table problems are mentioned as soon as the object's line information
				# is requested, even if that particular part of the line information is not currently required.
				# CHECK: warning: parsing line table prologue at 0x00000000 should have ended at 0x00000038 but it ended at 0x00000037
				# CHECK: warning: last sequence in debug line table is not terminated!
	# CHECK: error: undefined symbol: zed6			# CHECK: error: undefined symbol: zed6
	# CHECK: >>> referenced by {{.*}}tmp4.o:(.text+0x0)			# CHECK: >>> referenced by {{.*}}tmp4.o:(.text+0x0)

				# Show that a problem with one line table's information doesn't affect getting information from
				# a different one in the same object.
				# CHECK: error: undefined symbol: zed7
				# CHECK: >>> referenced by undef-bad-debug2.s:11 (dir2{{/|\\}}undef-bad-debug2.s:11)
				# CHECK: >>> {{.*}}tmp4.o:(.text+0x8)

	# RUN: not ld.lld %t.o %t2.a -o %t.exe -no-demangle 2>&1 | \			# RUN: not ld.lld %t.o %t2.a -o %t.exe -no-demangle 2>&1 | \
	# RUN: FileCheck -check-prefix=NO-DEMANGLE %s			# RUN: FileCheck -check-prefix=NO-DEMANGLE %s
	# NO-DEMANGLE: error: undefined symbol: _Z3fooi			# NO-DEMANGLE: error: undefined symbol: _Z3fooi

	.file "undef.s"			.file "undef.s"

	.globl _start			.globl _start
	_start:			_start:
	call foo			call foo
	call bar			call bar
	call zed1			call zed1
	call _Z3fooi			call _Z3fooi
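The offsets in the expected prologue warning (0x38 versus 0x37) can be checked by hand against the first unit in undef-bad-debug.s. The following worked arithmetic is a sketch that assumes the 32-bit DWARF version 4 line table header layout and that the unit starts at offset 0 of .debug_line; the function names are invented for this illustration.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>

// Bytes that precede the prologue body in a 32-bit DWARF v4 line table:
// unit_length (4) + version (2) + header_length (4).
uint64_t prologueBodyStart() { return 4 + 2 + 4; }

// Bytes a parser consumes while reading the first unit's prologue body,
// counted directly from the directives in undef-bad-debug.s.
uint64_t consumedPrologueBytes() {
  uint64_t N = 0;
  N += 6;  // minimum instruction length through opcode base, one byte each
  N += 12; // standard opcode lengths (opcode base 13, so 12 entries)
  N += std::strlen("dir") + 1;               // include directory plus NUL
  N += 1;                                    // end of include directory list
  N += std::strlen("undef-bad-debug.s") + 1; // file name plus NUL
  N += 3;                                    // directory index, mtime, length
  N += 1;                                    // end of file name list
  return N;
}

// The assembler-computed prologue length also covers the extraneous byte
// placed just before .Lprologue_end.
uint64_t declaredPrologueBytes() { return consumedPrologueBytes() + 1; }

// Where the header says the prologue ends.
uint64_t expectedEnd() { return prologueBodyStart() + declaredPrologueBytes(); }

// Where parsing of the prologue fields actually stops.
uint64_t actualEnd() { return prologueBodyStart() + consumedPrologueBytes(); }
```

Under these assumptions `expectedEnd()` is 0x38 and `actualEnd()` is 0x37, matching the CHECK line: the single extraneous byte accounts for the one-byte mismatch.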

This is an archive of the discontinued LLVM Phabricator instance.

[ELF] Rework debug line parsing to use llvm::Error and callbacks (LLD-side)
ClosedPublic
