This is an archive of the discontinued LLVM Phabricator instance.

[lld-link] diagnose undefined symbols before LTO when possible
ClosedPublic

Authored by inglorion on May 24 2019, 3:51 PM.

Details

Summary

This allows reporting undefined symbols before LTO codegen is
run. Since LTO codegen can take a long time, this improves user
experience by avoiding that time spend if the link is going to
fail with undefined symbols anyway.

Fixes PR32400.

Event Timeline

inglorion created this revision.May 24 2019, 3:51 PM
inglorion updated this revision to Diff 210470.Jul 17 2019, 6:05 PM
inglorion retitled this revision from [WIP][lld-link] diagnose undefined symbols before LTO if possible to [lld-link] diagnose undefined symbols before LTO when possible.
inglorion edited the summary of this revision. (Show Details)
inglorion added a reviewer: ruiu.

Ready for review now.

inglorion updated this revision to Diff 210472.Jul 17 2019, 6:13 PM

removed redundant check

inglorion updated this revision to Diff 210474.Jul 17 2019, 6:22 PM

don't process bitcode files in resolveRemainindUndefines

The basic idea here is:

reportRemainingUndefines did two things: it attempts to resolve undefined symbols, and it reports any that remain undefined.

This splits reportRemainingUndefines into two functions, one which reports symbols which we know are unresolvable (reportUnresolvable, which we run before LTO) and one which actually does the resolution (resolveRemainingUndefines, which we run after LTO).

Then there is a little bit of added code report undefined symbols in bitcode files.

ruiu added inline comments.Jul 17 2019, 9:48 PM
lld/COFF/SymbolTable.cpp
86

Source -> source

90

nit: maybe it is a bit more straightforward if you define res as std::string res and return a vector with return {res};?

90

Or, maybe it is better to avoid ostream and just append strings to res using +=.

145

This line is unreachable as you explicitly asserted it by assert, no?

I think you can change the second condition to return getSymbolLocations(cast<BitcodeFile>(file));

256–323

Can you expand the comment to explain what this function does a little bit more in detail?

inglorion marked 6 inline comments as done.

addressed @ruiu's comments

inglorion added inline comments.Jul 18 2019, 10:53 AM
lld/COFF/SymbolTable.cpp
145

This line is unreachable as you explicitly asserted it by assert, no?

When asserts are enabled, yes. But they aren't always.

I'm not super concerned that we would ever change this to pass in other types of file and forget to update it, but I kind of like the behavior as it is: if we do that with asserts enabled, we will get an informative message, and if we do it with asserts disabled, just don't return any symbol locations.

That said, if you want me to change it, I will.

256–323

Is this better?

added comment explaining BitcodeFiles::instances.empty() check

Linking ThinLTO chrome.dll with an undefined symbol:

Without this change: ~13 minutes.
With this change: ~4 seconds.

ruiu added a comment.Jul 19 2019, 2:33 AM

Generally looking ogod.

lld/COFF/SymbolTable.cpp
145

I feel like returning a "reasonable" value for an impossible condition does more harm than mitigating a pain. When a program reaches here under an unintended condition, something bad has already occurred, and we should report it of swallowing it.

How about changing the last line to llvm_unreachable?

inglorion updated this revision to Diff 210910.Jul 19 2019, 2:52 PM

replaced assert with llvm_unreachable

ruiu accepted this revision.Jul 24 2019, 12:10 PM

LGTM

lld/COFF/SymbolTable.cpp
144

I don't think you need this return after an unreachable statement. I believe all C++ compilers that are smart enough to recognize that this function always return a value or abort. (If it would fail on a buildbot, we can add this line later.)

This revision is now accepted and ready to land.Jul 24 2019, 12:10 PM
inglorion marked an inline comment as done.Jul 25 2019, 8:51 AM
inglorion added inline comments.
lld/COFF/SymbolTable.cpp
144

I grepped the LLVM codebase, and it seems that in other places, we follow llvm_unreachable() with return. I think it's safer and more consistent to leave this return in here, so unless you object I'd like to ship this as it stands.

This revision was automatically updated to reflect the committed changes.
thakis added a subscriber: thakis.Jul 31 2019, 4:57 PM
thakis added inline comments.
lld/trunk/COFF/SymbolTable.cpp
329 ↗(On Diff #211970)

A bot notices:

lld/COFF/SymbolTable.cpp:329:18: warning: unused variable ‘d’ [-Wunused-variable]
     if (Defined *d = undef->getWeakAlias())
                  ^

I'd just remove d but I'm not sure if you intended to do something with it.