This is an archive of the discontinued LLVM Phabricator instance.

Differential D27676

[ELF] - Use full object name if source file name exist when reporting errors.
Abandoned, Public

Authored by grimar on Dec 12 2016, 9:08 AM.


Details

Reviewers

ruiu
• rafael

Summary

Previously, if we were able to obtain the source file name, we used it exclusively for reporting.
The PR31354 case shows how confusing that can be:

Error is:
/usr/bin/ld: error: byte_copy.c:(.text+0x0): duplicate symbol 'byte_copy'
/usr/bin/ld: error: byte_copy.c:(function byte_copy): previous definition was here

After this patch, the output includes the object file name and, if any, the archive name, which makes clear what the error is about:
lld.exe: error: usr/ports/sysutils/safecat/work/safecat-1.13/byte_copy.o(byte_copy.c):(.text+0x0): duplicate symbol 'byte_copy'
lld.exe: error: usr/ports/sysutils/safecat/work/safecat-1.13/str.a(byte_copy.o)(byte_copy.c):(function byte_copy): previous definition was here

Diff Detail

Event Timeline

grimar updated this revision to Diff 81091. Dec 12 2016, 9:08 AM

grimar retitled this revision from to [ELF] - Use full object name if source file name exist when reporting errors..

grimar updated this object.

grimar added reviewers: ruiu, • rafael.

grimar added subscribers: llvm-commits, grimar, evgeny777.

Isn't it too verbose? Source file names are what you need to fix your problems, so it might be better not to print object file names, to keep developers from being distracted.

With my lld contributor hat off (and my lld user hat on), I'm 100% in favour of better diagnostics for lld's different archive semantics. We've already hit one or two cases in FreeBSD and at least two cases internally where this was an issue. @silvas is very fond of it, as I think he was the first one to notice it while trying to link a real codebase.

I think my concern is that we are going to print out at most three filenames like this

foo.a(foo.o)(foo.c)

and that error string format looks odd. It is indistinguishable from

foo.o(foo.c)

where in this case foo.o is not an archive but an object file. I think we are using too many parentheses.
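
The ambiguity described above can be reproduced with a small sketch. The helper `formatLocation` and its parameters are hypothetical names used only for illustration, not lld's API; the function merely mimics the concatenation that this patch performs, assuming an archive member is rendered as archive(member).

```cpp
#include <cassert>
#include <string>

// Hypothetical helper, not part of lld: renders a file the way the patch
// under review would, as archive(member)(source) or object(source).
std::string formatLocation(const std::string &Archive,
                           const std::string &Object,
                           const std::string &Source) {
  // An archive member is shown as archive(member); a plain object as itself.
  std::string Loc = Archive.empty() ? Object : Archive + "(" + Object + ")";
  // The source file name, when known, goes into one more pair of parentheses.
  if (!Source.empty())
    Loc += "(" + Source + ")";
  return Loc;
}
```

With this scheme, an archive member without a known source file renders as foo.a(foo.o), which has exactly the same shape as foo.o(foo.c) for a plain object with a known source file. A reader (or a tool) cannot tell the two cases apart from the string alone.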

In D27676#621721, @ruiu wrote:
I think my concern is that we are going to print out at most three filenames like this
foo.a(foo.o)(foo.c)
and that error string format looks odd. It is indistinguishable from
foo.o(foo.c)
where in this case foo.o is not an archive but an object file. I think we are using too many parentheses.

In the post-commit of r285186 I suggested using a note: to report the object file.

Maybe when we use a source location for the error, we can add a note something like:

note: this definition was found in baz.a(bar.o)

In D27676#621721, @ruiu wrote:
I think my concern is that we are going to print out at most three filenames like this
foo.a(foo.o)(foo.c)
and that error string format looks odd. It is indistinguishable from
foo.o(foo.c)
where in this case foo.o is not an archive but an object file. I think we are using too many parentheses.

Speaking about this particular "duplicate symbol" issue: as a user, I would probably be happy to see something close to Sean's suggestion, with additional info.

What if we use the following form here:
/usr/bin/ld: error: byte_copy.c:(.text+0x0): duplicate symbol 'byte_copy'
/usr/bin/ld: error: byte_copy.c:(function byte_copy): previous definition was here
note: 'byte_copy' definitions were found both in usr/ports/sysutils/safecat/work/safecat-1.13/byte_copy.o and usr/ports/sysutils/safecat/work/safecat-1.13/str.a(byte_copy.o),
check that you do not include objects files twice.

One more benefit of extended messages is that if something is still unclear, the user can always google something like "lld check that you do not include objects files twice error".
Such specific messages can help to find issues specific to lld and will probably make search results more useful.

In D27676#621926, @grimar wrote:
In D27676#621721, @ruiu wrote:
I think my concern is that we are going to print out at most three filenames like this
foo.a(foo.o)(foo.c)
and that error string format looks odd. It is indistinguishable from
foo.o(foo.c)
where in this case foo.o is not an archive but an object file. I think we are using too many parentheses.
Speaking about this particular "duplicate symbol" issue: as a user, I would probably be happy to see something close to Sean's suggestion, with additional info.

What if we use the following form here:
/usr/bin/ld: error: byte_copy.c:(.text+0x0): duplicate symbol 'byte_copy'
/usr/bin/ld: error: byte_copy.c:(function byte_copy): previous definition was here
note: 'byte_copy' definitions were found both in usr/ports/sysutils/safecat/work/safecat-1.13/byte_copy.o and usr/ports/sysutils/safecat/work/safecat-1.13/str.a(byte_copy.o),
check that you do not include objects files twice.

I'm not entirely sure about this diagnostic, and I think the problem is different here (in the general case).
It's not an object file included twice. It's that there are two object files providing a non-weak definition of the same symbol, and the order in which you fetch the members from the archive matters.
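
The point about fetch order can be illustrated with a toy model. Everything below is a hypothetical sketch, not lld's resolver: `Input`, `ToyResolver` and the `RememberLazy` switch are invented names. It assumes that with `RememberLazy` off an archive member is considered only when the archive is visited, while with it on an unloaded member can still be pulled in by a later reference. The input list mirrors the link line of the test added to conflict.s in this revision (archive first, then an object that calls zed, then the object that defines zed).

```cpp
#include <cassert>
#include <map>
#include <set>
#include <string>
#include <vector>

// Toy model only; none of these names exist in lld.
struct Input {
  std::string Name;
  bool IsArchiveMember;            // members are loaded only on demand
  std::vector<std::string> Defs;   // non-weak definitions
  std::vector<std::string> Undefs; // references to other symbols
};

class ToyResolver {
public:
  // RememberLazy = true: keep unloaded archive members around so that a
  // later reference can still pull them in.
  explicit ToyResolver(bool RememberLazy) : RememberLazy(RememberLazy) {}

  // Feeds the inputs in command-line order; returns the duplicated symbols.
  std::vector<std::string> run(const std::vector<Input> &Inputs) {
    for (const Input &In : Inputs) {
      if (!In.IsArchiveMember || definesUndefined(In))
        load(In);
      else if (RememberLazy)
        for (const std::string &D : In.Defs)
          if (!Defined.count(D))
            Lazy.emplace(D, &In);
    }
    return Dups;
  }

private:
  bool definesUndefined(const Input &In) const {
    for (const std::string &D : In.Defs)
      if (Undefined.count(D))
        return true;
    return false;
  }

  void load(const Input &In) {
    for (const std::string &D : In.Defs) {
      Lazy.erase(D);
      Undefined.erase(D);
      if (!Defined.emplace(D, In.Name).second)
        Dups.push_back(D); // a second non-weak definition
    }
    for (const std::string &U : In.Undefs) {
      if (Defined.count(U))
        continue;
      auto It = Lazy.find(U);
      if (It == Lazy.end()) {
        Undefined.insert(U);
        continue;
      }
      const Input *Member = It->second;
      load(*Member); // the reference pulls the member out of the archive
    }
  }

  bool RememberLazy;
  std::map<std::string, std::string> Defined;  // symbol -> defining file
  std::map<std::string, const Input *> Lazy;   // symbol -> unloaded member
  std::set<std::string> Undefined;
  std::vector<std::string> Dups;
};
```

Feeding the three inputs dbg2.a(dbg2.o), dbg1.o and dbg2.o to `ToyResolver(true)` reports zed as duplicated, because the reference in dbg1.o pulls in the archive member before dbg2.o is seen. `ToyResolver(false)` reports nothing for the same list, since the member is skipped at the time the archive is visited. In neither case is one object file simply included twice.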

In D27676#621936, @davide wrote:
In D27676#621926, @grimar wrote:
In D27676#621721, @ruiu wrote:
I think my concern is that we are going to print out at most three filenames like this
foo.a(foo.o)(foo.c)
and that error string format looks odd. It is indistinguishable from
foo.o(foo.c)
where in this case foo.o is not an archive but an object file. I think we are using too many parentheses.
Speaking about this particular "duplicate symbol" issue: as a user, I would probably be happy to see something close to Sean's suggestion, with additional info.

What if we use the following form here:
/usr/bin/ld: error: byte_copy.c:(.text+0x0): duplicate symbol 'byte_copy'
/usr/bin/ld: error: byte_copy.c:(function byte_copy): previous definition was here
note: 'byte_copy' definitions were found both in usr/ports/sysutils/safecat/work/safecat-1.13/byte_copy.o and usr/ports/sysutils/safecat/work/safecat-1.13/str.a(byte_copy.o),
check that you do not include objects files twice.
I'm not entirely sure about this diagnostic, and I think the problem is different here (in the general case).
It's not an object file included twice. It's that there are two object files providing a non-weak definition of the same symbol, and the order in which you fetch the members from the archive matters.

My wording may be inaccurate. The idea was to place the details in a note (as suggested by Sean) but also to add a reasonable hint about the possible cause of the issue.

I thought about this for a while. I'd say that having foo.a(foo.o)(foo.c) and foo.o(foo.c) are not correct because they are inconsistent. If the former were foo.a(foo.o(foo.c)), the two are at least consistent, but I guess that's hard to read.

We have a lot of combinations here.

Is an object file in an archive file? Yes or no
Do we know source filename? Yes or no
What do we know about the error location inside of the object file: Line number in the original source code, section name in an object file, or nothing

We don't need to print out all the information we know. For example, if we know a line number, we don't need to print out a section name. However, at the same time, we want to print out in a machine-readable format so that it can be integrated into an IDE.

So, that's a little bit complicated. We probably need to come up with a reasonable rule here.
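
One possible rule for the combinations listed above can be sketched as follows. This is only an illustration under stated assumptions, not the rule lld adopted: the `ErrorPlace` struct and `describe` function are hypothetical, and the sketch needs C++17 for `std::optional`. It prefers a source file plus line number when both are known and otherwise falls back to the object file (with its archive, if any) plus section offset.

```cpp
#include <cassert>
#include <optional>
#include <string>

// Hypothetical record of what is known about an error location.
struct ErrorPlace {
  std::string Object;                 // always known
  std::optional<std::string> Archive; // set if the object is an archive member
  std::optional<std::string> Source;  // source file name, if known
  std::optional<unsigned> Line;       // line number from debug info, if known
  std::optional<std::string> Section; // e.g. ".text+0x0", if known
};

// Picks the most specific description and drops the redundant parts.
std::string describe(const ErrorPlace &P) {
  // A source line is the most useful location; the section name is then
  // not needed.
  if (P.Source && P.Line)
    return *P.Source + ":" + std::to_string(*P.Line);

  // Otherwise identify the object file, wrapped in its archive if any.
  std::string File = P.Archive ? *P.Archive + "(" + P.Object + ")" : P.Object;
  if (P.Section)
    return File + ":(" + *P.Section + ")";
  return File;
}
```

Keeping each case down to a single, fixed shape is what would make the output easy for an IDE to parse. The price is that the object file name is lost whenever a line number is available, which is the gap the notes discussed in this thread are meant to fill.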

In D27676#622851, @ruiu wrote:

I thought about this for a while. I'd say that having foo.a(foo.o)(foo.c) and foo.o(foo.c) are not correct because they are inconsistent. If the former were foo.a(foo.o(foo.c)), the two are at least consistent, but I guess that's hard to read.

We have a lot of combinations here.

Is an object file in an archive file? Yes or no

Do we know source filename? Yes or no

What do we know about the error location inside of the object file: Line number in the original source code, section name in an object file, or nothing

I think we can keep the source file / line location information completely separate from the object file location information.

If you think of clang, the error: is often consistent with GCC. When clang provides more information (e.g. "did you mean?", "expanded from macro FOO defined here.", ...) it is usually as a note. Maybe we can be consistent with that: we produce a similar diagnostic to existing linkers, but below each error: line, we can add a handy note:.

For example, look at Clang's output for this:
https://godbolt.org/g/GMFHeq

#line 1 "foo.c"
#define DEFINITION_OF_FOO void foo() {}

DEFINITION_OF_FOO
DEFINITION_OF_FOO

The diagnostic is:

foo.c:4:1: error: redefinition of 'foo'
DEFINITION_OF_FOO
^
foo.c:1:32: note: expanded from macro 'DEFINITION_OF_FOO'
#define DEFINITION_OF_FOO void foo() {}
                               ^
foo.c:3:1: note: previous definition is here
DEFINITION_OF_FOO
^
foo.c:1:32: note: expanded from macro 'DEFINITION_OF_FOO'
#define DEFINITION_OF_FOO void foo() {}
                               ^
1 error generated.

I think we can be consistent with that. The error: redefinition of 'foo' and note: previous definition is here would be basically traditional object location linker diagnostics.

The note: expanded from macro 'DEFINITION_OF_FOO' and note: expanded from macro 'DEFINITION_OF_FOO' would be our extended linker diagnostics providing the original source location. But instead of "expanded from macro" they would be a message like note: source code of definition is here
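
The separation proposed here can be sketched as below. This is a minimal illustration, not what D27900 or lld implements: `DefSite` and `reportDuplicate` are hypothetical names, and keeping the "error:" prefix on both traditional lines is an assumption taken from the lld output quoted in the summary. Object file locations stay on the traditional lines, and a source location, when known, follows as a separate note.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical record; not lld's data structure.
struct DefSite {
  std::string ObjectLoc; // e.g. "foo.a(foo.o):(.text+0x0)"
  std::string SourceLoc; // e.g. "foo.c:4", empty when unknown
};

// Builds the diagnostic lines for a duplicate definition of Sym.
std::vector<std::string> reportDuplicate(const std::string &Sym,
                                         const DefSite &New,
                                         const DefSite &Old) {
  std::vector<std::string> Lines;
  auto Emit = [&Lines](const DefSite &Site, const std::string &Msg) {
    // Traditional linker diagnostic: object file location only.
    Lines.push_back("error: " + Site.ObjectLoc + ": " + Msg);
    // Extended diagnostic: the source location goes into its own note.
    if (!Site.SourceLoc.empty())
      Lines.push_back("note: " + Site.SourceLoc +
                      ": source code of definition is here");
  };
  Emit(New, "duplicate symbol '" + Sym + "'");
  Emit(Old, "previous definition was here");
  return Lines;
}
```

Because the note is optional, the two traditional lines keep a single fixed shape whether or not debug information is present, and the parenthesis question from earlier in the thread does not arise.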

grimar mentioned this in D27900: [ELF] - Keep the source file/line location information separate from the object file location information. Dec 18 2016, 6:13 AM

In D27676#622869, @silvas wrote:

In D27676#622851, @ruiu wrote:

I thought about this for a while. I'd say that having foo.a(foo.o)(foo.c) and foo.o(foo.c) are not correct because they are inconsistent. If the former were foo.a(foo.o(foo.c)), the two are at least consistent, but I guess that's hard to read.

We have a lot of combinations here.

Is an object file in an archive file? Yes or no

Do we know source filename? Yes or no

What do we know about the error location inside of the object file: Line number in the original source code, section name in an object file, or nothing

I think we can keep the source file / line location information completely separate from the object file location information.

I tried to implement that idea in D27900.

grimar abandoned this revision. Mar 30 2017, 1:31 AM

Revision Contents

Path                                        Size
ELF/InputSection.cpp                        8 lines
test/ELF/Inputs/conflict-debug2.s           4 lines
test/ELF/conflict.s                         8 lines
test/ELF/lto/combined-lto-object-name.ll    2 lines
test/ELF/undef.s                            8 lines

Diff 81091

ELF/InputSection.cpp

(first 215 lines not shown)
 template <class ELFT>
 std::string InputSectionBase<ELFT>::getLocation(typename ELFT::uint Offset) {
   // First check if we can get desired values from debugging information.
   std::string LineInfo = File->getLineInfo(this, Offset);
   if (!LineInfo.empty())
     return LineInfo;

   // File->SourceFile contains STT_FILE symbol that contains a
-  // source file name. If it's missing, we use an object file name.
-  std::string SrcFile = File->SourceFile;
-  if (SrcFile.empty())
-    SrcFile = toString(File);
+  // source file name. We add it if exist.
+  std::string SrcFile = toString(File);
+  if (!File->SourceFile.empty())
+    SrcFile += ("(" + File->SourceFile + ")").str();

   // Find a function symbol that encloses a given location.
   for (SymbolBody *B : File->getSymbols())
     if (auto *D = dyn_cast<DefinedRegular<ELFT>>(B))
       if (D->Section == this && D->Type == STT_FUNC)
         if (D->Value <= Offset && Offset < D->Value + D->Size)
           return SrcFile + ":(function " + toString(*D) + ")";

(remaining 618 lines not shown)

test/ELF/Inputs/conflict-debug2.s

+.file "conflict-debug.s"
+.globl zed
+zed:
+nop

test/ELF/conflict.s

(first 28 lines not shown)
 # ARCHIVE-NEXT: {{.*}}1.o:(.text+0x0): previous definition was here

 # RUN: llvm-mc -filetype=obj -triple=x86_64-unknown-linux %p/Inputs/conflict-debug.s -o %t-dbg.o
 # RUN: not ld.lld %t-dbg.o %t-dbg.o -o %t-dbg 2>&1 | FileCheck -check-prefix=DBGINFO %s

 # DBGINFO: conflict-debug.s:4: duplicate symbol 'zed'
 # DBGINFO-NEXT: conflict-debug.s:4: previous definition was here

+# RUN: llvm-mc -filetype=obj -triple=x86_64-unknown-linux %p/Inputs/conflict-debug2.s -o %t-dbg2.o
+# RUN: echo "call zed" > %t-dbg1.s
+# RUN: llvm-mc -filetype=obj -triple=x86_64-unknown-linux %t-dbg1.s -o %t-dbg1.o
+# RUN: llvm-ar rcs %t-dbg2.a %t-dbg2.o
+# RUN: not ld.lld %t-dbg2.a %t-dbg1.o %t-dbg2.o -o %t2 2>&1 | FileCheck -check-prefix=ARCHIVE2 %s
+# ARCHIVE2: {{.*}}-dbg2.o(conflict-debug.s):(.text+0x0): duplicate symbol 'zed'
+# ARCHIVE2-NEXT: {{.*}}-dbg2.a({{.*}}-dbg2.o)(conflict-debug.s):(.text+0x0): previous definition was here
+
 .globl _Z3muldd, foo
 _Z3muldd:
 foo:
 mov $60, %rax
 mov $42, %rdi
 syscall

test/ELF/lto/combined-lto-object-name.ll

 ; REQUIRES: x86
 ; RUN: llvm-as %s -o %t.o
 ; RUN: not ld.lld -m elf_x86_64 %t.o -o %t2 2>&1 | FileCheck %s

 target triple = "x86_64-unknown-linux-gnu"
 target datalayout = "e-m:e-i64:64-f80:128-n8:16:32:64-S128"

 declare void @foo()
 define void @_start() {
   call void @foo()
   ret void
 }

-; CHECK: error: ld-temp.o:(function _start): undefined symbol 'foo'
+; CHECK: error: lto.tmp(ld-temp.o):(function _start): undefined symbol 'foo'

test/ELF/undef.s

 # REQUIRES: x86
 # RUN: llvm-mc -filetype=obj -triple=x86_64-pc-linux %s -o %t.o
 # RUN: llvm-mc -filetype=obj -triple=x86_64-pc-linux %p/Inputs/undef.s -o %t2.o
 # RUN: llvm-mc -filetype=obj -triple=x86_64-pc-linux %p/Inputs/undef-debug.s -o %t3.o
 # RUN: llvm-ar rc %t2.a %t2.o
 # RUN: not ld.lld %t.o %t2.a %t3.o -o %t.exe 2>&1 | FileCheck %s
 # RUN: not ld.lld -pie %t.o %t2.a %t3.o -o %t.exe 2>&1 | FileCheck %s
-# CHECK: error: undef.s:(.text+0x1): undefined symbol 'foo'
-# CHECK: error: undef.s:(.text+0x6): undefined symbol 'bar'
-# CHECK: error: undef.s:(.text+0x10): undefined symbol 'foo(int)'
+# CHECK: error: {{.*}}.o(undef.s):(.text+0x1): undefined symbol 'foo'
+# CHECK: error: {{.*}}.o(undef.s):(.text+0x6): undefined symbol 'bar'
+# CHECK: error: {{.*}}.o(undef.s):(.text+0x10): undefined symbol 'foo(int)'
 # CHECK: error: {{.*}}2.a({{.*}}.o):(.text+0x0): undefined symbol 'zed2'
 # CHECK: error: dir/undef-debug.s:3: undefined symbol 'zed3'
 # CHECK: error: dir/undef-debug.s:7: undefined symbol 'zed4'
 # CHECK: error: dir/undef-debug.s:11: undefined symbol 'zed5'

 # RUN: not ld.lld %t.o %t2.a -o %t.exe -no-demangle 2>&1 | \
 # RUN: FileCheck -check-prefix=NO-DEMANGLE %s
-# NO-DEMANGLE: error: undef.s:(.text+0x10): undefined symbol '_Z3fooi'
+# NO-DEMANGLE: error: {{.*}}.o(undef.s):(.text+0x10): undefined symbol '_Z3fooi'

 .file "undef.s"

 .globl _start
 _start:
 call foo
 call bar
 call zed1
 call _Z3fooi
