Details

Reviewers

ruiu
espindola

Commits

rGd621037788c3: [ELF] Rework debug line parsing to use llvm::Error and callbacks (LLD-side)
rLLD331972: [ELF] Rework debug line parsing to use llvm::Error and callbacks (LLD-side)
rL331972: [ELF] Rework debug line parsing to use llvm::Error and callbacks (LLD-side)

Summary

Feedback on D44382 suggested changing the interface to use a callback instead of the structures it used. As I didn't want to effectively overwrite that diff, I created D44560 as an alternative implementation using the suggested approach. This is the corresponding patch required for LLD, assuming D44560 is applied.

As with D44382, D44560 changes the debug line parser interface to report LLVM errors through an interface that different executables can use, rather than always printing them directly as warnings to stderr. This change allows LLD to make use of the new interface and call its own warning methods to report problems.
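For readers unfamiliar with the pattern, the split between "returned" and "called back" problems can be sketched roughly as follows. This is a standalone toy, not the LLVM interface: `ToyUnit`, `ToyParseResult`, `ToyWarningCallback`, `parseToyUnit` and the one-byte-length format are all hypothetical names and assumptions made up for illustration.

```cpp
#include <cstddef>
#include <cstdint>
#include <functional>
#include <optional>
#include <string>
#include <vector>

// Hypothetical stand-in for a parsed unit; not an LLVM type.
struct ToyUnit {
  std::vector<uint8_t> Payload;
};

// Result of a parse: either a unit, or a message describing why none exists.
struct ToyParseResult {
  std::optional<ToyUnit> Unit;
  std::string FatalMessage; // non-empty only when Unit is empty
};

using ToyWarningCallback = std::function<void(const std::string &)>;

// Toy format (an assumption): one length byte, then that many payload bytes.
ToyParseResult parseToyUnit(const std::vector<uint8_t> &Data, size_t Offset,
                            const ToyWarningCallback &OnRecoverable) {
  // Unrecoverable: nothing to read at this offset, so report via the result.
  if (Offset >= Data.size())
    return {std::nullopt, "offset " + std::to_string(Offset) + " is not valid"};

  size_t Declared = Data[Offset];
  size_t Available = Data.size() - Offset - 1;
  if (Declared > Available) {
    // Recoverable: tell the caller, then carry on with what is there.
    OnRecoverable("unit declares " + std::to_string(Declared) +
                  " bytes but only " + std::to_string(Available) + " remain");
    Declared = Available;
  }

  ToyUnit U;
  U.Payload.assign(Data.begin() + Offset + 1,
                   Data.begin() + Offset + 1 + Declared);
  return {U, ""};
}
```

The point of the shape is that the parser never decides how loud a problem is: a linker could route the callback to its own warning function, while a dumping tool could print the same message differently.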

To test this, I have extended the bad-debug undefined symbol message case to show that a corresponding warning is printed if the debug line cannot be parsed. I have also added tests for LLD attempting to parse a non-existent or empty debug line section, showing that the new warning for attempting to do this is not emitted.

Diff Detail

Repository: rLLD LLVM Linker

Event Timeline

jhenderson created this revision. Mar 16 2018, 6:54 AM

jhenderson mentioned this in D44560: [DWARF] Rework debug line parsing to use llvm::Error and callbacks.

jhenderson mentioned this in D44388: [ELF] Rework debug line parsing to use llvm::Error (LLD-side). Mar 19 2018, 2:26 AM

grimar added inline comments. Mar 19 2018, 2:46 AM

ELF/InputFiles.cpp
128	I would move this comment above `isValidOffset(0)` and update it as you are doing something with offset 0 right there already.
136	Is it useful to `warn` here? I think we only call this method on linkage error, so anyways going to terminate the link. With that, it seems easier to always error out here to simplify code/logic.

jhenderson added inline comments. Mar 19 2018, 3:28 AM

ELF/InputFiles.cpp
128	Actually, the comment (and code) is technically wrong - it is possible for there to be multiple CUs in a single object input. I'm thinking objects built via LTO or -r links. I'll add a FIXME and file a bug, along with updating the comment.
136	I'm currently preserving behaviour (or fixing intended behaviour, at least), though I certainly see where you're coming from. In general we try to emit as many errors/warnings as we can before stopping, so I still think it's useful - it might help the user identify some other problem, for example. I'm not bothered either way though, so if the consensus is to not emit the warning, I can easily change that.

grimar added inline comments. Mar 19 2018, 3:33 AM

ELF/InputFiles.cpp
136	I meant we always can `error` here instead of `warn` for all cases I think.

grimar added inline comments. Mar 19 2018, 3:34 AM

ELF/InputFiles.cpp
136	As at this point, we are already in error state.

Update comment.

ELF/InputFiles.cpp
136	Right, but that would suggest to the user that fixing the malformed debug line is a requirement to get a working link, which it isn't. Perhaps another way to think about it is if we wanted to introduce a new warning that could use source information. We'd want to use this function too, but we shouldn't stop the link succeeding if we can't parse the line table. I could actually make a case for making the second one a warning too, but I made that an error, because we don't actually ever expect it to be reported currently.
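The "expected kind is a warning, anything else is an error" split being discussed here can be sketched as follows. This is only a rough standalone illustration: `ToyErrorBase`, `ToyLineError`, `ToyOtherError`, `ToyDiagnostics` and `reportParseFailure` are hypothetical names, and plain `dynamic_cast` stands in for whatever typed dispatch the real code uses.

```cpp
#include <string>
#include <utility>
#include <vector>

// Hypothetical error hierarchy; not the LLVM error classes.
struct ToyErrorBase {
  virtual ~ToyErrorBase() = default;
  virtual std::string message() const = 0;
};

// The kind we expect to see when a line table is malformed.
struct ToyLineError : ToyErrorBase {
  std::string Msg;
  explicit ToyLineError(std::string M) : Msg(std::move(M)) {}
  std::string message() const override { return Msg; }
};

// Any other kind, which is not expected to be reported in practice.
struct ToyOtherError : ToyErrorBase {
  std::string Msg;
  explicit ToyOtherError(std::string M) : Msg(std::move(M)) {}
  std::string message() const override { return Msg; }
};

// Collected diagnostics, split by severity.
struct ToyDiagnostics {
  std::vector<std::string> Warnings;
  std::vector<std::string> Errors;
};

// Known line-table problems do not fail the link; unknown kinds do.
void reportParseFailure(const ToyErrorBase &E, ToyDiagnostics &D) {
  if (dynamic_cast<const ToyLineError *>(&E))
    D.Warnings.push_back(E.message());
  else
    D.Errors.push_back(E.message());
}
```

With this shape, changing the policy for the generic case (for example downgrading it to a warning as well) is a one-line change in `reportParseFailure`.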

JDevlieghere added a subscriber: JDevlieghere. Mar 19 2018, 8:21 AM

espindola added inline comments. Mar 19 2018, 2:24 PM

test/ELF/undef.s
37	I think I agree with George about producing an error on corrupted output. In practice we expect to never see it, so we may as well make the code simpler.

espindola added inline comments. Mar 19 2018, 2:25 PM

ELF/InputFiles.cpp
130	When do we expect to have LineData that doesn't include the current CU?

David Blaikie via llvm-commits <llvm-commits@lists.llvm.org> writes:

@rafael on the mailing list:

Now that I think of it, the fact that lld could not parse the debug info to provide a better error message should always be a note, not an error, regardless of how broken the section happens to be.

To be clear, are you saying you're happy with the warnings for now, or would you prefer to drop the severity to just use "message()"? Also, what do you think about the generic error type which currently emits an error? I'd be happy to change the severity of that one too. It's currently an error because we don't expect an Error of that type ever to be reported.

ELF/InputFiles.cpp
130	LineData contains the contents of the .debug_line section. It is possible to have an empty section, or a missing section, in which case there will be no valid contents in LineData. Previously, when LLD requested parsing of this section, the parser would return false immediately, because the offset was invalid. With the changes in D44560, an Error is now returned saying the offset is invalid. In addition, since an object file can contain multiple CUs (e.g. via LTO or -r links), some of which might be missing debug data and others not, we can't know for certain using this method that offset 0 is the offset of the CU for the corresponding symbol (see the reproducible in PR36793).
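The guard described above can be illustrated with a small standalone sketch. `ToySection`, `parseAt` and `loadLineInfo` are hypothetical names invented for this example; only the idea of checking that offset 0 is valid before calling the parser comes from the patch.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical stand-in for the bytes of a .debug_line section.
struct ToySection {
  std::vector<unsigned char> Bytes;
  // True only if a byte exists at Offset, so always false for an empty
  // or missing section.
  bool isValidOffset(size_t Offset) const { return Offset < Bytes.size(); }
};

// Hypothetical parser that complains when asked to parse at a bad offset.
bool parseAt(const ToySection &S, size_t Offset,
             std::vector<std::string> &Diags) {
  if (!S.isValidOffset(Offset)) {
    Diags.push_back("invalid offset");
    return false;
  }
  return true; // real parsing would happen here
}

// Caller-side guard: skip the parser entirely for empty or missing sections.
bool loadLineInfo(const ToySection &S, std::vector<std::string> &Diags) {
  if (!S.isValidOffset(0))
    return false; // no line info, and deliberately no diagnostic
  return parseAt(S, 0, Diags);
}
```

Without the guard in `loadLineInfo`, every object lacking a .debug_line section would produce an "invalid offset" diagnostic, which is what the two new no-line-parser-errors tests check does not happen.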

In D44562#1042990, @jhenderson wrote:

@rafael on the mailing list:

Now that I think of it, the fact that lld could not parse the debug info to provide a better error message should always be a note, not an error, regardless of how broken the section happens to be.

To be clear, are you saying you're happy with the warnings for now, or would you prefer to drop the severity to just use "message()"? Also, what do you think about the generic error type which currently emits an error? I'd be happy to change the severity of that one too. It's currently an error because we don't expect an Error of that type ever to be reported.

Using message() in all cases seems better. We know the link is failing for unrelated reasons and we are just letting the user know why we are not printing better error messages (could not parse the debug info).

In D44562#1043904, @espindola wrote:

Using message() in all cases seems better. We know the link is failing for unrelated reasons and we are just letting the user know why we are not printing better error messages (could not parse the debug info).

message() prints to stdout, which I hadn't realised earlier. This clearly should be printed to stderr, for which there is no mechanism currently, if we don't want a warning or error. Thinking about it more, the rest of LLVM has a concept of "remarks" which are "lower" severity messages than errors or warnings (and can't be promoted to errors, as I understand it). How about I add this to LLD? Alternatively, I'll need to add a note() function, or just print to errs() directly.

Add a "remark" method to the LLD ErrorHandler, and use that instead of warnings to report problems with the debug line. Also added another case to the testing, to demonstrate failures via the callback in contrast to failures via the return value.

Tentative LGTM pending the llvm change.

This revision is now accepted and ready to land. Mar 26 2018, 6:52 PM

Thanks @espindola. I'll still need to rebase this once I've figured out what to do following rLLD328284 (see also my comments in D44560). I don't anticipate this changing from using remark() though. It's just how that function gets called.

Rebase following rLLD328284. I was able to drop one of the test inputs, now that LLD can handle multiple CUs in the same input file. Also added a comment to the test explaining the purpose of the zed6/zed7 test cases.

This revision is now accepted and ready to land. Mar 28 2018, 7:01 AM

@espindola/@ruiu, are you happy with the latest update?

ruiu added inline comments. Mar 28 2018, 1:54 PM

Common/ErrorHandler.cpp
86–91 ↗	(On Diff #140069)	Honestly I don't think we want to add a new level of error messages, as I think error() for errors, warning() for warnings and message() for non-error messages is enough. If something needs to be fixed, it should be an error or a warning. If something doesn't have to be fixed, lld shouldn't print out anything. A situation like "verbose messages are printed out but you can ignore them" isn't actionable and thus not desirable. So, can you choose either (1) show it as a warning or (2) don't show anything?

jhenderson added inline comments. Mar 29 2018, 1:42 AM

Common/ErrorHandler.cpp
86–91 ↗	(On Diff #140069)	Okay, I have no problem making it a warning. I think this might be a safer course than not printing anything. If undefined symbol messages are incomplete, due to a problem in the input, it might be possible for users to fix it, depending on the nature of the problem (e.g. bogus hand-written debug_line assembly), and therefore allowing them to see the useful error information, allowing them to more quickly identify the problem. I could also see us potentially using this function in the future to produce other informational messages, or implement other features, without always emitting an error, so a warning would be appropriate here too.

Change back to using warnings.

LGTM. I will review the LLVM side too.

This revision is now accepted and ready to land. Mar 29 2018, 3:38 PM

jhenderson edited the summary of this revision. May 10 2018, 3:52 AM

jhenderson removed a subscriber: rafael.

Closed by commit rL331972: [ELF] Rework debug line parsing to use llvm::Error and callbacks (LLD-side) (authored by jhenderson). May 10 2018, 3:56 AM

This revision was automatically updated to reflect the committed changes.

Diff 138892

ELF/InputFiles.cpp

	(115 lines not shown)

	template <class ELFT> void ObjFile<ELFT>::initializeDwarf() {			template <class ELFT> void ObjFile<ELFT>::initializeDwarf() {
	DWARFContext Dwarf(make_unique<LLDDwarfObj<ELFT>>(this));			DWARFContext Dwarf(make_unique<LLDDwarfObj<ELFT>>(this));
	const DWARFObject &Obj = Dwarf.getDWARFObj();			const DWARFObject &Obj = Dwarf.getDWARFObj();
	DwarfLine.reset(new DWARFDebugLine);			DwarfLine.reset(new DWARFDebugLine);
	DWARFDataExtractor LineData(Obj, Obj.getLineSection(), Config->IsLE,			DWARFDataExtractor LineData(Obj, Obj.getLineSection(), Config->IsLE,
	Config->Wordsize);			Config->Wordsize);

	// The second parameter is offset in .debug_line section			const DWARFDebugLine::LineTable *LT = nullptr;
	// for compilation unit (CU) of interest. We have only one			// The second parameter in getOrParseLineTable is the offset in the
	// CU (object file), so offset is always 0.			// .debug_line section for the compilation unit (CU) of interest. We assume we
	const DWARFDebugLine::LineTable *LT =			// have only one CU (object file), so the offset is always 0.
	DwarfLine->getOrParseLineTable(LineData, 0, Dwarf, nullptr);			// FIXME: Under some circumstances, we can have more than one CU, e.g. objects
				// built via -r links or LTO. See PR36793.
				if (LineData.isValidOffset(0)) {
				Expected<const DWARFDebugLine::LineTable *> ErrOrLineTable =
				DwarfLine->getOrParseLineTable(LineData, 0, Dwarf, nullptr, warn);
				if (ErrOrLineTable)
				LT = *ErrOrLineTable;
				else
				handleAllErrors(ErrOrLineTable.takeError(),
				[&](DebugLineError &Err) { warn(Err.message()); },
				[&](ErrorInfoBase &Err) { error(Err.message()); });
				}
	if (!LT)			if (!LT)
	return;			return;

	// Return if there is no debug information about CU available.			// Return if there is no debug information about CU available.
	if (!Dwarf.getNumCompileUnits())			if (!Dwarf.getNumCompileUnits())
	return;			return;

	// Loop over variable records and insert them to VariableLoc.			// Loop over variable records and insert them to VariableLoc.
	(1,113 lines not shown)

test/ELF/Inputs/undef-bad-debug.s

	.section .text,"ax"			.section .text,"ax"
	sym:			sym:
	.quad zed6			.quad zed6

				.section .debug_line,"",@progbits
				.long .Lunit_end - .Lunit_start # unit length
				.Lunit_start:
				.short 4 # version
				.long .Lprologue_end - .Lprologue_start # prologue length
				.Lprologue_start:
				.byte 1 # minimum instruction length
				.byte 1 # maximum operations per instruction
				.byte 1 # default is_stmt
				.byte -5 # line base
				.byte 14 # line range
				.byte 13 # opcode base
				.byte 0, 1, 1, 1, 1, 0, 0, 0, 1, 0, 0, 1 # standard opcode lengths
				.asciz "dir" # include directories
				.byte 0
				.asciz "undef-bad-debug.s" # file names
				.byte 1, 0, 0
				.byte 0
				.byte 0 # extraneous byte
				.Lprologue_end:
				.byte 0, 9, 2 # DW_LNE_set_address
				.quad sym
				.byte 3 # DW_LNS_advance_line
				.byte 10
				.byte 1 # DW_LNS_copy
				.byte 2 # DW_LNS_advance_pc
				.byte 8
				.byte 0, 1, 1 # DW_LNE_end_sequence
				.Lunit_end:

	.section .debug_info,"",@progbits			.section .debug_info,"",@progbits
	.long .Lcu_end - .Lcu_start # Length of Unit			.long .Lcu_end - .Lcu_start # Length of Unit
	.Lcu_start:			.Lcu_start:
	.short 4 # DWARF version number			.short 4 # DWARF version number
	.long .Lsection_abbrev # Offset Into Abbrev. Section			.long .Lsection_abbrev # Offset Into Abbrev. Section
	.byte 8 # Address Size (in bytes)			.byte 8 # Address Size (in bytes)
	.byte 1 # Abbrev [1] 0xb:0x79 DW_TAG_compile_unit			.byte 1 # Abbrev [1] 0xb:0x79 DW_TAG_compile_unit
	.byte 2 # Abbrev [2] 0x2a:0x15 DW_TAG_variable			.byte 2 # Abbrev [2] 0x2a:0x15 DW_TAG_variable
	(32 lines not shown)

test/ELF/no-line-parser-errors-if-empty-section.s

				# REQUIRES: x86

				# LLD uses the debug data to get information for error messages, if possible.
				# However, if the debug line section is empty, we should not attempt to parse
				# it, as that would result in errors from the parser.

				# RUN: llvm-mc -filetype=obj -triple=x86_64-pc-linux %s -o %t.o
				# RUN: not ld.lld %t.o -o %t.elf 2>&1 | FileCheck %s

				# CHECK-NOT: warning:
				# CHECK-NOT: error:
				# CHECK: error: undefined symbol: undefined
				# CHECK-NEXT: {{.*}}.o:(.text+0x1)
				# CHECK-NOT: warning:
				# CHECK-NOT: error:

				.globl _start
				_start:
				callq undefined

				.section .debug_line,"",@progbits

test/ELF/no-line-parser-errors-if-no-section.s

				# REQUIRES: x86

				# LLD uses the debug data to get information for error messages, if possible.
				# However, if there is no debug line section, we should not attempt to parse
				# it, as that would result in errors from the parser.

				# RUN: llvm-mc -filetype=obj -triple=x86_64-pc-linux %s -o %t.o
				# RUN: not ld.lld %t.o -o %t.elf 2>&1 | FileCheck %s

				# CHECK-NOT: warning:
				# CHECK-NOT: error:
				# CHECK: error: undefined symbol: undefined
				# CHECK-NEXT: {{.*}}.o:(.text+0x1)
				# CHECK-NOT: warning:
				# CHECK-NOT: error:

				.globl _start
				_start:
				callq undefined

test/ELF/undef.s

	(28 lines not shown)
	# CHECK: error: undefined symbol: zed4			# CHECK: error: undefined symbol: zed4
	# CHECK: >>> referenced by undef-debug.s:7 (dir{{/|\\}}undef-debug.s:7)			# CHECK: >>> referenced by undef-debug.s:7 (dir{{/|\\}}undef-debug.s:7)
	# CHECK: >>> {{.*}}.o:(.text.1+0x0)			# CHECK: >>> {{.*}}.o:(.text.1+0x0)

	# CHECK: error: undefined symbol: zed5			# CHECK: error: undefined symbol: zed5
	# CHECK: >>> referenced by undef-debug.s:11 (dir{{/|\\}}undef-debug.s:11)			# CHECK: >>> referenced by undef-debug.s:11 (dir{{/|\\}}undef-debug.s:11)
	# CHECK: >>> {{.*}}.o:(.text.2+0x0)			# CHECK: >>> {{.*}}.o:(.text.2+0x0)

				# CHECK: warning: parsing line table prologue at 0x00000000 should have ended at 0x00000038 but it ended at 0x00000037
	# CHECK: error: undefined symbol: zed6			# CHECK: error: undefined symbol: zed6
	# CHECK: >>> referenced by {{.*}}tmp4.o:(.text+0x0)			# CHECK: >>> referenced by {{.*}}tmp4.o:(.text+0x0)

	# RUN: not ld.lld %t.o %t2.a -o %t.exe -no-demangle 2>&1 | \			# RUN: not ld.lld %t.o %t2.a -o %t.exe -no-demangle 2>&1 | \
	# RUN: FileCheck -check-prefix=NO-DEMANGLE %s			# RUN: FileCheck -check-prefix=NO-DEMANGLE %s
	# NO-DEMANGLE: error: undefined symbol: _Z3fooi			# NO-DEMANGLE: error: undefined symbol: _Z3fooi

	.file "undef.s"			.file "undef.s"

	.globl _start			.globl _start
	_start:			_start:
	call foo			call foo
	call bar			call bar
	call zed1			call zed1
	call _Z3fooi			call _Z3fooi

This is an archive of the discontinued LLVM Phabricator instance.

[ELF] Rework debug line parsing to use llvm::Error and callbacks (LLD-side)
Closed, Public