This is an archive of the discontinued LLVM Phabricator instance.

Add a addAbsolute static function to Writer.cpp
ClosedPublic

Authored by arichardson on Oct 18 2018, 6:04 AM.

Details

Summary

SymbolTable::addAbsolute() was removed in rL344305.
To me is more readable than the lambda named Add and in our out-of-tree
CHERI target we use addAbsolute() in another function.

Event Timeline

arichardson created this revision.Oct 18 2018, 6:04 AM

For what it's worth, I'm in favor, it seems friendlier to out of tree targets, and I agree on the readability remark. Just my two cents in.

ruiu added a comment.Oct 18 2018, 6:58 AM

First, let me explain my motivation as to why I made changes to the symbol table in the first place, so that you understand the background of my changes.

One of the most important goals of lld is speed, and we know that hash table lookup is one of the slowest operations in the linker because hash table lookup is not very fast and we have massive number of symbols to be inserted to a hash table. In lld, we strictly do only one symbol table lookup per a symbol, which I believe the minimum number of hash table lookup, to make the linker as fast as possible.

But I noticed a few weeks ago that we do insert the same symbol more than once if we are using --start-lib and --end-lib options. These options give archive file semantics to object files between the options, so that you can avoid using "ar" command (which tend to be slow) to create archive files from object files. Let's call object files between --{start,end}-lib lazy objects. In lld, when we visit lazy objects for the first time, we add lazy symbols for lazy objects as placeholders, to memorize that symbols in lazy objects can be resolvable if we really need them. When the linker knows that it needs lazy objects, it then create real symbols (defined and undefined) to replace lazy symbols. To do that, it looks up the symbol table again with the same set of symbols. That is a violation of our design policy.

So, I started fixing it, and noticed that that's not very easy to do, because all SymbolTable functions takes not a Symbol but a symbol name (which is StringRef) as a key. As long as we are using that interface, we cannot avoid hash table lookups. I also noticed that SymbolTable has too many utility functions that are not really necessary, which makes it hard to make changes to the code. So, I started off with simplifying the class first.

That is the motivation to reduce number of SymbolTable's member functions. My goal was not achieved yet, but I needed to simplify it first.

Back to your change, this change itself seems neutral to me from the readability perspective, but it is *very* likely to "break" again, possibly soon, because this is not really a public API or anything, and I didn't finish my job there yet. Are you fine with that? You can always write the same function in your local patch.

Thank you very much for the detailed explanation.

I am totally fine with future breakage if you don't mind me committing this.
In addition to fixing our out-of-tree target, this patch really helps my understanding of the function because there is another lambda named Add a few lines down that adds a symbol that might point at the elf header instead of being an absolute symbol. If you'd rather keep it as a local lambda instead of a static function just before, I could also rename it to AddAbsolute

ruiu accepted this revision.Oct 19 2018, 1:53 PM

LGTM

Please commit.

This revision is now accepted and ready to land.Oct 19 2018, 1:53 PM
This revision was automatically updated to reflect the committed changes.