Page MenuHomePhabricator

Add a feature to explain why some file gets included to the linker's output
Needs ReviewPublic

Authored by ruiu on Oct 30 2019, 12:33 AM.

Details

Summary

This patch proposes a new option --explain. This patch is not intended to be
committed as-is but is for discussion.

So, I sent https://reviews.llvm.org/D67388 to dump the internal dependency graph
from the linker so that users can run arbitrary graph algorithms to analyze
linker outputs. But I believe in most cases what users want to know is simple:
why some file, that wasn't previously linked, is now included to the final
binary? I think this situation occurs so frequently that we probably should add
a new feature that answers to that particular question, so that users don't have
to write a graph analysis program.

This is an example output of lld when --explain=lib/libLLVMSupport.a(APInt.cpp.o)
is given (shortened to fit to the screen).

This is why 'lib/libLLVMSupport.a(APInt.cpp.o)' is linked:

'(--entry option)' uses '_start' defined in '/usr/lib/x86_64-linux-gnu/crt1.o'
which uses 'main' defined in 'tools/lld/tools/lld/CMakeFiles/lld.dir/lld.cpp.o'
which uses 'StringRef::endswith_lower()' defined in 'lib/libLLVMSupport.a(StringRef.cpp.o)'
which uses 'APInt::zext()' defined in 'lib/libLLVMSupport.a(APInt.cpp.o)'

What we are doing in this patch is the basic breadth-first search in the
dependency graph.

The feature implemented in this patch is somewhat redundant once we land
https://reviews.llvm.org/D67388, but this option seems pretty practical and easy
to use. So, I guess that adding something like this would make users life a bit
easier.

Event Timeline

ruiu created this revision.Oct 30 2019, 12:33 AM
Herald added a project: Restricted Project. · View Herald Transcript

Looks useful.

lld/ELF/Explain.cpp
92

Perhaps worth to show a list of files available for --explain to user on this step?
Or have another way to get a list of files that can be passed here. May be explain without arguments could show something.

I agree that having the most common case built in is valuable. Unless there is an analysis program that is shipped with the linker I suspect that the majority of users won't want to write their own algorithm.
Looking at the example output:

'(--entry option)' uses '_start'
'/usr/lib/x86_64-linux-gnu/crt1.o' uses 'main'
'lld.cpp.o' uses 'lld::elf::link(llvm::ArrayRef<char const*>)'
'lib/liblldELF.a(Driver.cpp.o)' uses 'llvm::object::Archive::create()'
'lib/libLLVMObject.a(Archive.cpp.o)' uses 'llvm::Expected<bool>::operator bool()'
'lib/libLLVMPasses.a(PassBuilder.cpp.o)' uses 'llvm::InlinerPass::~InlinerPass()'
'lib/libLLVMipo.a(Inliner.cpp.o)'

I'd be tempted to start with the last object in the chain that is unconditionally present on the command line. For example _start, crt1.o, etc. will always be present and most likely not visible to the user as they have been added by the compiler driver. In this case I'd expect lld.cpp.o to be the first entry in the diagnostic.

Although it is implicit, it may be worth stating the dependency as:

'lld.cpp.o' uses 'lld::elf::link(llvm::ArrayRef<char const*>)' defined in 'lib/liblldELF.a(Driver.cpp.o)'

It would make it a bit more verbose, but everything is on one line.

I think this is a cool feature. Could it accept symbol names in addition to files as well? ld64 has a "-why_live" for symbols, and a "-why_load" for object files (https://www.manpagez.com/man/1/ld64/). I've used why_live a lot more than why_load.

joerg added a subscriber: joerg.Oct 30 2019, 7:13 AM

Just for the sake of correctness, this is provided by GNU ld as part of the map file (-Wl,-Map=foo.map). It lists all archive members and which object file and symbol triggered the inclusion.

pcc added a comment.Oct 30 2019, 9:42 AM

I agree with Nico that it would be nice if this could take symbol names as well. Maybe the --explain output could be added to the output of --trace`?

lld/ELF/Explain.cpp
96

Hmm, object files aren't really gc roots, so this could give misleading output (e.g. ld a.o b.o --explain a with a.o containing a: ret and b.o containing b: call a with only b in dynsym won't report b.o). Also, it misses archives included with --whole-archive.

Shouldn't this be more similar to MarkLive.cpp? In other words, include dynsym and don't treat object files as roots.

126

Shouldn't it also look at dependentSections?

168

I don't think the isWeak part here is correct.

MaskRay added inline comments.Oct 30 2019, 11:30 PM
lld/ELF/Explain.cpp
102

--init and --fini are used in --gc-sections but they do not fetch lazy definitions. I do not know why GNU ld behaves this way.

158

We can make edges a superset of specialEdges, then we can just delete seen and use edges.try_emplace(...) here.

168

-u/-e can fetch a weak lazy definition. This may cause a bogus file unreachable error.

MaskRay added inline comments.Oct 30 2019, 11:33 PM
lld/ELF/Driver.cpp
1990

Missing full stop.

lld/ELF/Explain.h
12

Missing #include "lld/Common/LLVM.h"

MaskRay added inline comments.Oct 30 2019, 11:37 PM
lld/ELF/Options.td
46

It is not obvious that the user is supposed to specify b.a(b.o) for an object file in an archive or b.o for an object between --start-lib and --end-lib.

ruiu marked 4 inline comments as done.Oct 30 2019, 11:47 PM

I agree that having the most common case built in is valuable. Unless there is an analysis program that is shipped with the linker I suspect that the majority of users won't want to write their own algorithm.
Looking at the example output:

'(--entry option)' uses '_start'
'/usr/lib/x86_64-linux-gnu/crt1.o' uses 'main'
'lld.cpp.o' uses 'lld::elf::link(llvm::ArrayRef<char const*>)'
'lib/liblldELF.a(Driver.cpp.o)' uses 'llvm::object::Archive::create()'
'lib/libLLVMObject.a(Archive.cpp.o)' uses 'llvm::Expected<bool>::operator bool()'
'lib/libLLVMPasses.a(PassBuilder.cpp.o)' uses 'llvm::InlinerPass::~InlinerPass()'
'lib/libLLVMipo.a(Inliner.cpp.o)'

I'd be tempted to start with the last object in the chain that is unconditionally present on the command line. For example _start, crt1.o, etc. will always be present and most likely not visible to the user as they have been added by the compiler driver. In this case I'd expect lld.cpp.o to be the first entry in the diagnostic.

Although it is implicit, it may be worth stating the dependency as:

'lld.cpp.o' uses 'lld::elf::link(llvm::ArrayRef<char const*>)' defined in 'lib/liblldELF.a(Driver.cpp.o)'

It would make it a bit more verbose, but everything is on one line.

How about the new one? I think I can reverse the order by making the message in a passive tense, but it seems that this order is a bit easier to read.

lld/ELF/Explain.cpp
92

Added a hint as to how to get a list of input files.

96

I think object files are actually roots in this features sense if you do not pass -gc-sections, so we need to handle two different cases, -no-gc-sections and -gc-sections. How is this new code?

126

Done.

168

Yeah, this was in a wrong place. I moved this to enqueue.

ruiu updated this revision to Diff 227227.Oct 30 2019, 11:47 PM
  • address review comments
ruiu edited the summary of this revision. (Show Details)Oct 30 2019, 11:56 PM
ruiu marked 5 inline comments as done.Oct 31 2019, 12:11 AM
ruiu added inline comments.
lld/ELF/Explain.cpp
158

We probably could but looks like it also makes sense to keep them distinctive sets, so I'm leaning towards not doing that.

ruiu updated this revision to Diff 227231.Oct 31 2019, 12:12 AM
  • address vreview comments
MaskRay added inline comments.Oct 31 2019, 12:17 AM
lld/ELF/Explain.cpp
157

We may consider turning the error into a regular print. A similar functionality -y not_exist does not error.

ruiu marked an inline comment as done.Oct 31 2019, 12:29 AM
ruiu added inline comments.
lld/ELF/Explain.cpp
158

It's not a strong preference, but error is perhaps better than printing it out as a non-error message? When the control reaches here, something is wrong from the user's perspective.

ruiu edited the summary of this revision. (Show Details)Oct 31 2019, 12:35 AM
grimar added inline comments.Oct 31 2019, 1:54 AM
lld/ELF/Explain.cpp
176

But it doesn't seem helps for the case mentioned in the description?

If we have a main.o compiled from

.global _start;
_start:
callq foo

And a foo.a which contains a foo.o compiled from

.globl foo;
foo:

Then when I invoke -flavor gnu main.o foo.a "-o" out --verbose I see:

lld: main.o
lld: foo.a

Then, like a possible normal user :) I do:
-flavor gnu --explain=foo.a main.o foo.a "-o" out --verbose

and I see:

lld: main.o
lld: foo.a
lld: error: --explain: no such file or symbol: 'foo.a'. Use --verbose option to
see a list of input files.

How I am supposed to realise that I should invoke -flavor gnu --explain=foo.a(foo.o) main.o foo.a "-o" out --verbose
to see the following?

lld: main.o
lld: foo.a
Explain: This is why 'foo.a(foo.o)' is linked:
Explain:
Explain: '(--entry option)' uses '_start' defined in 'main.o'
Explain: which uses 'foo' defined in 'foo.a(foo.o)'
grimar added inline comments.Oct 31 2019, 1:57 AM
lld/ELF/Explain.cpp
176

btw, note that it says '(--entry option)' uses '_start', but in fact I am not using any command line option like -e

ruiu marked 2 inline comments as done.Oct 31 2019, 2:09 AM
ruiu added inline comments.
lld/ELF/Explain.cpp
176

That's I think fine. It's an implicit option but still there.

176

Well, I don't think we should print out all section names because it's just too long, and I believe showing a hint is better. The problem is that --verbose option doesn't show archive members, but that can be fixed simply by adding a log() call to fetch().

ruiu updated this revision to Diff 227243.Oct 31 2019, 2:10 AM
  • address review comments

Thanks for the update, I'm happy with the explain output format.

One anomaly that I'm not sure is worth dealing with is the linkerscript GROUP command as used in libc.so

GROUP ( /lib/libc.so.6 /usr/lib/libc_nonshared.a  AS_NEEDED ( /lib/ld-linux-armhf.so.3 ) )

I think that these will just end up as input files on the command line. Although they are non-obvious and implicitly included due to the linker script.

lld/ELF/Explain.cpp
169

Global symbols defined by a comdat group may cause some confusion as there will be multiple object files that define the symbol, but only one group that is selected that will match here. Not a lot that we can do here. Perhaps if we knew that a symbol was defined in a group then we could print that out. "Explain: Symbol foo is defined in COMDAT group g, selected from from file foo.o". Perhaps wait and see if anyone does get confused.

lld/ELF/Options.td
47

A suggestion:

Explain why a given object file gets linked in to the final binary. Objects can be specified by full path, or by a global symbol that they define. Objects defined in archives are specified by full/path/to/library(object) such as library.a(object.o).
MaskRay added inline comments.Oct 31 2019, 3:21 PM
lld/ELF/Explain.cpp
158

error inhibits producing an output. I have a feeling that retaining the output may be slightly more useful.

Sometimes the user can specify multiple --explain. The user may still need the output though some of the --explain values are not existent.

169

Explaining more about COMDAT groups looks good to me. We essentially discarded all but one definitions, lost some information in the transformation, and the best we can do here is to select the prevailing definition. The user is hinted from the message that the archive order may matter.

A COMDAT group is a SHT_GROUP section with the ELF group section flag GRP_COMDAT. Its name is almost always ".group". Shall we say the COMDAT group with signature "g"?

176

Is --trace better? --verbose suggests the raw archive names while --trace does not, and the --trace output does not include unused files.

# --verbose
ld.lld: a.a
ld.lld: b.o
ld.lld: a.a(a.o)
# --trace
b.o
a.a(a.o)
lld/ELF/Options.td
47

Unfortunately this (lld::elf::InputFile::archiveName) is not the full path. Here are some examples:

ld.lld a.a => a.a
ld.lld -L. -la => ./liba.a
ld.lld /tmp/c/a.a => /tmp/c/a.a

So we may need path normalization here. Some users may resolve the path and feed that to lld.

ruiu updated this revision to Diff 227410.Nov 1 2019, 3:18 AM
  • added tests
  • added a man page entry
  • removed dependent section tracking because dependencies are within the same object file, while this explain feature works in the file granularity
ruiu marked 5 inline comments as done.Nov 1 2019, 3:49 AM

Thanks for the update, I'm happy with the explain output format.

One anomaly that I'm not sure is worth dealing with is the linkerscript GROUP command as used in libc.so

GROUP ( /lib/libc.so.6 /usr/lib/libc_nonshared.a  AS_NEEDED ( /lib/ld-linux-armhf.so.3 ) )

I think that these will just end up as input files on the command line. Although they are non-obvious and implicitly included due to the linker script.

In most cases I believe linker scripts are explicitly added to the command line by users, and that wouldn't cause too much confusion as they should know what they are doing. One exception is Linux's libc.so which is not a DSO but actually a linker script, but maybe that's the only implicit use case?

lld/ELF/Explain.cpp
158

I guess that's actually a good thing, no? I want users to use this option only interactively just by appending an --explain to an otherwise complete command line, and if a command line option passed by a user is not correct, that is, well, I think an error.

169

Currently it is not easy to know whether a section was in a comdat group or not. Fortunately, I don't think that's too confusing, at least we don't tell a lie to users. If we report that there's some dependency from one file to another, that dependency does exist in the source code, though other files may have similar dependencies to the same file. So I don't think we need to do something special for comdats.

176

Thank you for the suggestion. I replaced --verbose with --trace.

lld/ELF/Options.td
47

Added that text to the man page, thanks!

47

OK, I made a change to code so that we normalize pathnames before comparison.

ruiu updated this revision to Diff 227412.Nov 1 2019, 3:50 AM
  • address review comments
grimar added a comment.Nov 1 2019, 3:52 AM

This needs a test case.

ruiu updated this revision to Diff 227415.Nov 1 2019, 3:59 AM
  • add a test file which I forgot to add in a previous patch
pcc added a comment.Nov 1 2019, 9:02 AM

removed dependent section tracking because dependencies are within the same object file, while this explain feature works in the file granularity

Is it a good idea to make the feature use file granularity in the --gc-sections case? I would imagine that this feature would be used in tricky situations where the explanation for a section being included is not entirely obvious, so doing things at a file granularity may end up hiding critical information. That's just my opinion though.

lld/ELF/Explain.cpp
96

Maybe I'm missing something, but I don't see where dynsyms are being added in the new code in the --gc-sections case,

MaskRay added inline comments.Nov 5 2019, 9:16 AM
lld/ELF/Explain.cpp
96

We don't need to handle --init, --fini or .dynsym as opposed to MarkLive.cpp.

# c.s -> c.o
.globl _start
_start:
  ret

.section .text.foo,"ax"
call bar

# b.s -> b.o -> b.a
% cat b.s
.globl bar
bar:
  call foo

# a.s -> a.o -> a.a
.globl foo
foo:
  ret

For example, in ld.lld c.o b.a a.a --gc-sections -o a, neither b.a nor a.a is necessary (none of their sections is retained) but we still need to link in b.a and a.a before discarding their sections. The archive handling logic happens before garbage collection.

I think this is a cool feature. Could it accept symbol names in addition to files as well? ld64 has a "-why_live" for symbols, and a "-why_load" for object files (https://www.manpagez.com/man/1/ld64/). I've used why_live a lot more than why_load.

@thakis The current implementation uses the option name --explain and allows either a symbol name and a filename pattern. Do you have a preference of using an alternative name, or using different option names?

I think this is a cool feature. Could it accept symbol names in addition to files as well? ld64 has a "-why_live" for symbols, and a "-why_load" for object files (https://www.manpagez.com/man/1/ld64/). I've used why_live a lot more than why_load.

@thakis The current implementation uses the option name --explain and allows either a symbol name and a filename pattern. Do you have a preference of using an alternative name, or using different option names?

I don't have an opinion on the flag name. --explain sounds fine, and it working with both file and symbol names seems fine too.