This is an archive of the discontinued LLVM Phabricator instance.

LLD: Core: Make the resolver faster
Needs ReviewPublic

Authored by ruiu on Mar 5 2015, 3:19 PM.
This revision needs review, but there are no reviewers specified.

Details

Reviewers
None
Summary

In the resolver, we maintain a list of undefined symbols, and when we
visit an archive file, we check that file if undefined symbols can be
resolved using files in the archive. The archive file class provides
a find() function to lookup a symbol.

Previously, we call find() for each undefined symbols. Archive files
may be visited multiple times if they are in a --start-group and
--end-group. If we visit a file M times and if we have N undefined
symbols, find() is called M*N times. I found that that is one of the
most significant bottlenecks in LLD when linking a large executable.

find() is not a very cheap operation because it looks up a hash table
for a given string. And a string, or a symbol name, can be pretty long
if you are dealing with C++ symbols.

We can eliminate the bottleneck.

Calling find() with the same symbol multiple times is a waste. If a
result of looking up a symbol is "not found", it stays "not found"
forever because the symbol simply doesn't exist in the archive.
Thus, we should call find() only for newly-added undefined symbols.
This optimization makes O(M*N) O(N).

In this patch, all undefined symbols are added to a vector. For each
archive/shared library file, we maintain a start position P. All
symbols [0, P) are already searched. [P, end of the vector) are not
searched yet. For each file, we scan the vector only once.

This patch changes the order in which undefined symbols are looked for.
Previously, we iterated over the result _symbolTable.undefines().
Now we iterate over the new vector. This is a benign change but caused
differences in output if remaining undefines exist. This is why some
tests are updated.

The performance improvement of this patch seems sometimes significant.
Previously, linking chrome.dll on my workstation (Xeon 2.4GHz 8 cores)
took about 70 seconds. Now it takes (only?) 30 seconds!

Diff Detail

Event Timeline

ruiu updated this revision to Diff 21318.Mar 5 2015, 3:19 PM
ruiu retitled this revision from to LLD: Core: Make the resolver faster.
ruiu updated this object.
ruiu edited the test plan for this revision. (Show Details)
ruiu added a project: lld.
ruiu updated this object.
ruiu added a subscriber: Unknown Object (MLST).
ruiu updated this object.Mar 5 2015, 3:26 PM
davide added a subscriber: davide.Mar 5 2015, 4:26 PM

After a closer look to the patch I can say this looks good to me. Please commit this.

ruiu added a comment.Mar 5 2015, 4:31 PM

Thanks. Committed as r231434.

emaste added a subscriber: emaste.Mar 5 2015, 6:11 PM