This is an archive of the discontinued LLVM Phabricator instance.

Differential D53393

Add an addAbsolute static function to Writer.cpp
Closed, Public

Authored by arichardson on Oct 18 2018, 6:04 AM.

Details

Reviewers

ruiu
espindola

Commits

rG39adc6df0d0a: Add an addAbsolute static function to Writer.cpp
rLLD344842: Add an addAbsolute static function to Writer.cpp
rL344842: Add an addAbsolute static function to Writer.cpp

Summary

SymbolTable::addAbsolute() was removed in rL344305.
To me, a named function is more readable than the lambda named Add, and our
out-of-tree CHERI target uses addAbsolute() in another function.

Diff Detail

Repository

rLLD LLVM Linker

Build Status

Buildable 23896
Build 23895: arc lint + arc unit

Event Timeline

arichardson created this revision. Oct 18 2018, 6:04 AM

Herald added a reviewer: espindola. Oct 18 2018, 6:04 AM

Herald added subscribers: llvm-commits, emaste.

Harbormaster completed remote builds in B23896: Diff 170074. Oct 18 2018, 6:04 AM

For what it's worth, I'm in favor: it seems friendlier to out-of-tree targets, and I agree with the readability remark. Just my two cents.

arichardson mentioned this in rL344305: Remove SymbolTable::addAbsolute(). Oct 18 2018, 6:41 AM

First, let me explain my motivation as to why I made changes to the symbol table in the first place, so that you understand the background of my changes.

One of the most important goals of lld is speed, and we know that hash table lookup is one of the slowest operations in the linker: each lookup is relatively expensive, and we have a massive number of symbols to insert into the hash table. In lld, we strictly do only one symbol table lookup per symbol, which I believe is the minimum number of hash table lookups, to make the linker as fast as possible.

But I noticed a few weeks ago that we do insert the same symbol more than once if we are using the --start-lib and --end-lib options. These options give archive file semantics to the object files between them, so that you can avoid using the "ar" command (which tends to be slow) to create archive files from object files. Let's call object files between --{start,end}-lib lazy objects. In lld, when we visit lazy objects for the first time, we add lazy symbols for them as placeholders, to remember that symbols in lazy objects can be resolved if we really need them. When the linker knows that it needs a lazy object, it then creates real symbols (defined and undefined) to replace the lazy symbols. To do that, it looks up the symbol table again with the same set of symbols. That is a violation of our design policy.

So I started fixing it, and noticed that it's not very easy to do, because all SymbolTable functions take not a Symbol but a symbol name (which is a StringRef) as a key. As long as we are using that interface, we cannot avoid hash table lookups. I also noticed that SymbolTable has too many utility functions that are not really necessary, which makes it hard to make changes to the code. So I started off by simplifying the class first.
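To make the lookup-count argument concrete, here is a minimal sketch under stated assumptions. It is not lld's code: SketchSymbol, SketchSymbolTable, and replaceLazy are hypothetical names invented for illustration, and std::unordered_map stands in for whatever hash table the linker actually uses. It only shows why an interface keyed by name forces a second lookup, while one that takes the symbol pointer does not.

```cpp
#include <cassert>
#include <string>
#include <unordered_map>
#include <utility>

// Hypothetical, simplified model; these are NOT lld's real classes.
enum class Kind { Lazy, Defined, Undefined };

struct SketchSymbol {
  std::string Name;
  Kind K;
};

class SketchSymbolTable {
public:
  // Name-keyed interface: every call costs one hash table lookup.
  // If the name already exists, the existing entry is returned unchanged.
  SketchSymbol *insert(const std::string &Name, Kind K) {
    ++Lookups;
    auto Result = Map.emplace(Name, SketchSymbol{Name, K});
    // Pointers to unordered_map elements stay valid across rehashing.
    return &Result.first->second;
  }

  unsigned lookups() const { return Lookups; }

private:
  std::unordered_map<std::string, SketchSymbol> Map;
  unsigned Lookups = 0;
};

// Symbol-keyed replacement: the caller already holds the pointer returned by
// the first insert, so turning a lazy placeholder into a real symbol needs
// no further hash table lookup.
inline void replaceLazy(SketchSymbol *Sym, Kind NewKind) {
  assert(Sym->K == Kind::Lazy && "only lazy placeholders are replaced");
  Sym->K = NewKind;
}
```

In this model, inserting a lazy placeholder and later calling replaceLazy on the returned pointer costs one lookup in total; calling insert again with the same name costs two, which is the pattern described above for lazy objects.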

That is the motivation for reducing the number of SymbolTable's member functions. My goal has not been achieved yet, but I needed to simplify the class first.

Back to your change, this change itself seems neutral to me from the readability perspective, but it is *very* likely to "break" again, possibly soon, because this is not really a public API or anything, and I didn't finish my job there yet. Are you fine with that? You can always write the same function in your local patch.

Thank you very much for the detailed explanation.

I am totally fine with future breakage if you don't mind me committing this.
In addition to fixing our out-of-tree target, this patch really helps my understanding of the function, because there is another lambda named Add a few lines down that adds a symbol that might point at the ELF header instead of being an absolute symbol. If you'd rather keep it as a local lambda instead of a static function just before, I could also rename it to AddAbsolute.
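As a rough illustration of the reuse point (again a sketch, not the patch itself): a lambda defined inside one function is only visible there, whereas a file-local static function can be called from any function in the same file. Every name below (AbsoluteSymbols, addAbsoluteSketch, addReservedSymbolsSketch, addTargetSymbolsSketch, and the symbol name in the second caller) is hypothetical.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical stand-in for the symbol table: just records the names.
static std::vector<std::string> AbsoluteSymbols;

// File-local helper with internal linkage. Unlike a lambda declared inside
// one function, it is visible to every function in this file.
static std::size_t addAbsoluteSketch(const std::string &Name) {
  AbsoluteSymbols.push_back(Name);
  return AbsoluteSymbols.size() - 1; // index of the newly added entry
}

// First caller, loosely mirroring the MIPS branch in the diff.
void addReservedSymbolsSketch() { addAbsoluteSketch("_gp"); }

// Second caller, standing in for an out-of-tree target's extra function.
void addTargetSymbolsSketch() {
  addAbsoluteSketch("__hypothetical_target_symbol");
}
```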

LGTM

Please commit.

This revision is now accepted and ready to land. Oct 19 2018, 1:53 PM

Closed by commit rL344842: Add an addAbsolute static function to Writer.cpp (authored by arichardson). Oct 20 2018, 4:14 AM

This revision was automatically updated to reflect the committed changes.

Revision Contents

Path: ELF/Writer.cpp
Size: 16 lines

Diff 170074

ELF/Writer.cpp

@@ static Defined *addOptionalRegular(StringRef Name, SectionBase *Sec,
   if (!S || S->isDefined())
     return nullptr;
   Symbol *Sym = Symtab->addDefined(Name, StOther, STT_NOTYPE, Val,
                                    /*Size=*/0, Binding, Sec,
                                    /*File=*/nullptr);
   return cast<Defined>(Sym);
 }

+static Defined *addAbsolute(StringRef Name) {
+  return cast<Defined>(Symtab->addDefined(Name, STV_HIDDEN, STT_NOTYPE, 0, 0,
+                                          STB_GLOBAL, nullptr, nullptr));
+};
+
 // The linker is expected to define some symbols depending on
 // the linking result. This function defines such symbols.
 void elf::addReservedSymbols() {
   if (Config->EMachine == EM_MIPS) {
-    auto Add = [](StringRef Name) {
-      return cast<Defined>(Symtab->addDefined(Name, STV_HIDDEN, STT_NOTYPE, 0,
-                                              0, STB_GLOBAL, nullptr, nullptr));
-    };
-
     // Define _gp for MIPS. st_value of _gp symbol will be updated by Writer
     // so that it points to an absolute address which by default is relative
     // to GOT. Default offset is 0x7ff0.
     // See "Global Data Symbols" in Chapter 6 in the following document:
     // ftp://www.linux-mips.org/pub/linux/mips/doc/ABI/mipsabi.pdf
-    ElfSym::MipsGp = Add("_gp");
+    ElfSym::MipsGp = addAbsolute("_gp");

     // On MIPS O32 ABI, _gp_disp is a magic symbol designates offset between
     // start of function and 'gp' pointer into GOT.
     if (Symtab->find("_gp_disp"))
-      ElfSym::MipsGpDisp = Add("_gp_disp");
+      ElfSym::MipsGpDisp = addAbsolute("_gp_disp");

     // The __gnu_local_gp is a magic symbol equal to the current value of 'gp'
     // pointer. This symbol is used in the code generated by .cpload pseudo-op
     // in case of using -mno-shared option.
     // https://sourceware.org/ml/binutils/2004-12/msg00094.html
     if (Symtab->find("__gnu_local_gp"))
-      ElfSym::MipsLocalGp = Add("__gnu_local_gp");
+      ElfSym::MipsLocalGp = addAbsolute("__gnu_local_gp");
   }

   // The Power Architecture 64-bit v2 ABI defines a TableOfContents (TOC) which
   // combines the typical ELF GOT with the small data sections. It commonly
   // includes .got .toc .sdata .sbss. The .TOC. symbol replaces both
   // _GLOBAL_OFFSET_TABLE_ and _SDA_BASE_ from the 32-bit ABI. It is used to
   // represent the TOC base which is offset by 0x8000 bytes from the start of
   // the .got section.