This is an archive of the discontinued LLVM Phabricator instance.


Differential D51833

ELF: Add --build-id-link-dir=DIR switch
Needs Review · Public

Authored by phosek on Sep 8 2018, 3:20 PM.


Details

Reviewers: jakehehrlich, espindola, mcgrathr

Diff Detail

Repository: rLLD LLVM Linker

Event Timeline

mcgrathr created this revision. Sep 8 2018, 3:20 PM

Herald added a reviewer: espindola. Sep 8 2018, 3:20 PM

Herald added subscribers: llvm-commits, MaskRay, arichardson, emaste.

phosek mentioned this in D51835: [ADT] Support converting to lowercase string in toHex. Sep 8 2018, 5:53 PM

This seems like a new feature proposal, and we haven't discussed it before. It's not clear to me why you have to do this inside the linker rather than in a post-processing tool. Could you please elaborate on why you want to add a new option?

phosek mentioned this in rL341852: [ADT] Support converting to lowercase string in toHex. Sep 10 2018, 12:36 PM

In D51833#1229002, @ruiu wrote:

This seems like a new feature proposal, and we haven't discussed it before. It's not clear to me why you have to do this inside the linker rather than in a post-processing tool. Could you please elaborate on why you want to add a new option?

The .build-id/xx/xxx.debug lookup protocol is already used by various tools, such as debuggers (LLDB implements this lookup, for example) and Linux packagers. It's also supported by other ELF tools such as elfutils. We'd like to use it in Fuchsia as well and integrate it into our build system.
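As a rough illustration of the layout mentioned above, the following sketch derives a .build-id/xx/xxx.debug style path from raw build ID bytes. The helper names (toLowerHex, buildIdPath) and the default ".debug" suffix are assumptions made for this example; they are not taken from the patch or from any of the tools named in the comment.

```cpp
#include <cstdint>
#include <string>
#include <vector>

// Lowercase hex encoding of a byte sequence, two digits per byte.
std::string toLowerHex(const std::vector<uint8_t> &bytes) {
  static const char digits[] = "0123456789abcdef";
  std::string out;
  out.reserve(bytes.size() * 2);
  for (uint8_t b : bytes) {
    out.push_back(digits[b >> 4]);
    out.push_back(digits[b & 0xf]);
  }
  return out;
}

// Hypothetical helper: returns "<root>/xx/yyyy<suffix>", where xx is the hex
// of the first build ID byte and yyyy is the hex of the remaining bytes.
// Returns an empty string if the ID is shorter than two bytes.
std::string buildIdPath(const std::string &root,
                        const std::vector<uint8_t> &id,
                        const std::string &suffix = ".debug") {
  if (id.size() < 2)
    return "";
  std::string hex = toLowerHex(id);
  return root + "/" + hex.substr(0, 2) + "/" + hex.substr(2) + suffix;
}
```

For a three-byte ID ab cd ef and root ".build-id", this yields ".build-id/ab/cdef.debug". Passing an empty suffix gives the form without the ".debug" extension.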

The problem is that determining the build ID after linking is not very straightforward. You can use a tool like llvm-readobj, but that tool doesn't have machine-readable output, so you need to parse its output, which is error-prone and adds extra overhead. It also means that in our build system we'd need to add an additional step or wrap linking in a script, which complicates things, especially if you want to do it in a portable fashion that works on Linux, Windows, and macOS.
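To make the post-processing alternative concrete, here is a minimal sketch of extracting the build ID from the bytes of a single GNU build ID note (a 16-byte header followed by the ID). It assumes a little-endian note that has already been located in the file; finding the note inside a full ELF image is omitted. The function name parseBuildIdNote is hypothetical.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <optional>
#include <vector>

// Reads a little-endian 32-bit word from p.
static uint32_t read32le(const uint8_t *p) {
  return uint32_t(p[0]) | uint32_t(p[1]) << 8 | uint32_t(p[2]) << 16 |
         uint32_t(p[3]) << 24;
}

// Hypothetical helper: parses one ELF note laid out as
//   namesz (4 bytes) | descsz (4 bytes) | type (4 bytes) | "GNU\0" | desc
// and returns the descriptor bytes (the build ID), or nullopt if the buffer
// is not a well-formed little-endian GNU build ID note.
std::optional<std::vector<uint8_t>>
parseBuildIdNote(const std::vector<uint8_t> &note) {
  constexpr uint32_t NT_GNU_BUILD_ID = 3;
  constexpr size_t HeaderSize = 16; // three words plus the 4-byte name
  if (note.size() < HeaderSize)
    return std::nullopt;
  uint32_t namesz = read32le(&note[0]);
  uint32_t descsz = read32le(&note[4]);
  uint32_t type = read32le(&note[8]);
  if (type != NT_GNU_BUILD_ID || namesz != 4)
    return std::nullopt;
  if (std::memcmp(&note[12], "GNU", 4) != 0)
    return std::nullopt;
  if (descsz > note.size() - HeaderSize)
    return std::nullopt;
  return std::vector<uint8_t>(note.begin() + HeaderSize,
                              note.begin() + HeaderSize + descsz);
}
```

Even with a helper like this, a build would still need a separate tool invocation after every link, which is the extra step the comment above is trying to avoid.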

Initially we considered adding a flag to lld that would allow writing the build ID into a file, which would allow something like ld.lld --build-id-file=>(id=$(</dev/fd/0); mkdir -p build-id/${id:0:2} && ln -f {{output}} build-id/${id:0:2}/${id:2}), but this again is not portable and won't work on Windows. So we instead tried prototyping support for linking the output file directly into the .build-id/xx/xxx.debug layout to see how complicated it would be, and it turned out to be pretty straightforward, as you can see here (I have an even simpler version that eliminates the hex representation computation by using llvm::toHex instead).

Would this be an acceptable addition? It would simplify our build system, and this solution should work across all platforms without any extra effort. If this is fine with you, I'll update the change and also write some tests.

phosek commandeered this revision. Sep 10 2018, 4:45 PM

phosek updated this revision to Diff 164767.

phosek edited reviewers, added: mcgrathr; removed: phosek.

I don't think I understand the use case for this feature yet.

If you hard-link two files, they have identical contents (strictly speaking, there is only one file with two filenames). If a debugger can find an executable that has debug info in the .build-id/xx/xxxxx directory, it should be able to find that debug info in the executable that's being debugged. So, how does it work?

If the only problem is that llvm-objdump's output is not machine-readable, you can add a new option to llvm-objdump to print out a build ID, can't you?

In D51833#1229776, @ruiu wrote:

I don't think I understand the use case for this feature yet.

If you hard-link two files, they have identical contents (strictly speaking, there is only one file with two filenames). If a debugger can find an executable that has debug info in the .build-id/xx/xxxxx directory, it should be able to find that debug info in the executable that's being debugged. So, how does it work?

We would strip the binary, and the stripped binary is what gets executed. This applies not only to executables but also to shared libraries. When the debugger connects to a process, it finds all the ELF files mapped into memory and reads their build IDs; from there it needs a way to map those back to the files that contain the debug information. On Fuchsia, we're always cross-compiling and producing a system image that contains all the stripped binaries, but we need to keep the debug binaries around so we can debug or symbolize them remotely. Ideally we would then point all debugging tools at the .build-id root to look up the binaries by their build ID.

If the only problem is that llvm-objdump's output is not machine-readable, you can add a new option to llvm-objdump to print out a build ID, can't you?

We considered it, but that would be really complicated, since llvm-objdump and llvm-readobj don't have any reasonable in-memory representation that we could easily serialize into a machine-readable format (e.g. JSON); those tools are really optimized for printing formatted output. Also, this solution would still require some post-processing, and we would need to ensure that it works on all platforms.

Revision Contents

Path	Size
lld/ELF/Config.h	1 line
lld/ELF/Driver.cpp	1 line
lld/ELF/Options.td	5 lines
lld/ELF/SyntheticSections.h	2 lines
lld/ELF/SyntheticSections.cpp	4 lines
lld/ELF/Writer.cpp	40 lines

Diff 164767

lld/ELF/Config.h

	// This struct contains the global configuration for the linker.			// This struct contains the global configuration for the linker.
	// Most fields are direct mapping from the command line options			// Most fields are direct mapping from the command line options
	// and such fields have the same name as the corresponding options.			// and such fields have the same name as the corresponding options.
	// Most fields are initialized by the driver.			// Most fields are initialized by the driver.
	struct Configuration {			struct Configuration {
	uint8_t OSABI = 0;			uint8_t OSABI = 0;
	llvm::CachePruningPolicy ThinLTOCachePolicy;			llvm::CachePruningPolicy ThinLTOCachePolicy;
	llvm::StringMap<uint64_t> SectionStartMap;			llvm::StringMap<uint64_t> SectionStartMap;
				llvm::StringRef BuildIdLinkDir;
	llvm::StringRef Chroot;			llvm::StringRef Chroot;
	llvm::StringRef DynamicLinker;			llvm::StringRef DynamicLinker;
	llvm::StringRef DwoDir;			llvm::StringRef DwoDir;
	llvm::StringRef Entry;			llvm::StringRef Entry;
	llvm::StringRef Emulation;			llvm::StringRef Emulation;
	llvm::StringRef Fini;			llvm::StringRef Fini;
	llvm::StringRef Init;			llvm::StringRef Init;
	llvm::StringRef LTOAAPipeline;			llvm::StringRef LTOAAPipeline;

lld/ELF/Driver.cpp

void LinkerDriver::readConfigs(opt::InputArgList &Args) {
...

Config->AllowMultipleDefinition =		Config->AllowMultipleDefinition =
Args.hasFlag(OPT_allow_multiple_definition,		Args.hasFlag(OPT_allow_multiple_definition,
OPT_no_allow_multiple_definition, false) ||		OPT_no_allow_multiple_definition, false) ||
hasZOption(Args, "muldefs");		hasZOption(Args, "muldefs");
Config->AuxiliaryList = args::getStrings(Args, OPT_auxiliary);		Config->AuxiliaryList = args::getStrings(Args, OPT_auxiliary);
Config->Bsymbolic = Args.hasArg(OPT_Bsymbolic);		Config->Bsymbolic = Args.hasArg(OPT_Bsymbolic);
Config->BsymbolicFunctions = Args.hasArg(OPT_Bsymbolic_functions);		Config->BsymbolicFunctions = Args.hasArg(OPT_Bsymbolic_functions);
		Config->BuildIdLinkDir = Args.getLastArgValue(OPT_build_id_link_dir);
Config->CheckSections =		Config->CheckSections =
Args.hasFlag(OPT_check_sections, OPT_no_check_sections, true);		Args.hasFlag(OPT_check_sections, OPT_no_check_sections, true);
Config->Chroot = Args.getLastArgValue(OPT_chroot);		Config->Chroot = Args.getLastArgValue(OPT_chroot);
Config->CompressDebugSections = getCompressDebugSections(Args);		Config->CompressDebugSections = getCompressDebugSections(Args);
Config->Cref = Args.hasFlag(OPT_cref, OPT_no_cref, false);		Config->Cref = Args.hasFlag(OPT_cref, OPT_no_cref, false);
Config->DefineCommon = Args.hasFlag(OPT_define_common, OPT_no_define_common,		Config->DefineCommon = Args.hasFlag(OPT_define_common, OPT_no_define_common,
!Args.hasArg(OPT_relocatable));		!Args.hasArg(OPT_relocatable));
Config->Demangle = Args.hasFlag(OPT_demangle, OPT_no_demangle, true);		Config->Demangle = Args.hasFlag(OPT_demangle, OPT_no_demangle, true);

lld/ELF/Options.td


	def Bstatic: F<"Bstatic">, HelpText<"Do not link against shared libraries">;			def Bstatic: F<"Bstatic">, HelpText<"Do not link against shared libraries">;

	def build_id: F<"build-id">, HelpText<"Alias for --build-id=fast">;			def build_id: F<"build-id">, HelpText<"Alias for --build-id=fast">;

	def build_id_eq: J<"build-id=">, HelpText<"Generate build ID note">,			def build_id_eq: J<"build-id=">, HelpText<"Generate build ID note">,
	MetaVarName<"[fast,md5,sha,uuid,0x<hexstring>]">;			MetaVarName<"[fast,md5,sha,uuid,0x<hexstring>]">;

				defm build_id_link_dir:
				Eq<"build-id-link-dir",
				"Hard-link output to <dir>/xx/xxx name derived from hex build ID">,
				MetaVarName<"<dir>">;

	defm check_sections: B<"check-sections",			defm check_sections: B<"check-sections",
	"Check section addresses for overlaps (default)",			"Check section addresses for overlaps (default)",
	"Do not check section addresses for overlaps">;			"Do not check section addresses for overlaps">;

	defm compress_debug_sections:			defm compress_debug_sections:
	Eq<"compress-debug-sections", "Compress DWARF debug sections">,			Eq<"compress-debug-sections", "Compress DWARF debug sections">,
	MetaVarName<"[none,zlib]">;			MetaVarName<"[none,zlib]">;


lld/ELF/SyntheticSections.h

	class BuildIdSection : public SyntheticSection {			class BuildIdSection : public SyntheticSection {
	// First 16 bytes are a header.			// First 16 bytes are a header.
	static const unsigned HeaderSize = 16;			static const unsigned HeaderSize = 16;

	public:			public:
	BuildIdSection();			BuildIdSection();
	void writeTo(uint8_t *Buf) override;			void writeTo(uint8_t *Buf) override;
	size_t getSize() const override { return HeaderSize + HashSize; }			size_t getSize() const override { return HeaderSize + HashSize; }
	void writeBuildId(llvm::ArrayRef<uint8_t> Buf);			llvm::SmallVector<uint8_t, 32> writeBuildId(llvm::ArrayRef<uint8_t> Buf);

	private:			private:
	void computeHash(llvm::ArrayRef<uint8_t> Buf,			void computeHash(llvm::ArrayRef<uint8_t> Buf,
	std::function<void(uint8_t *, ArrayRef<uint8_t>)> Hash);			std::function<void(uint8_t *, ArrayRef<uint8_t>)> Hash);

	size_t HashSize;			size_t HashSize;
	uint8_t *HashBuf;			uint8_t *HashBuf;
	};			};

lld/ELF/SyntheticSections.cpp

}		}

BssSection::BssSection(StringRef Name, uint64_t Size, uint32_t Alignment)		BssSection::BssSection(StringRef Name, uint64_t Size, uint32_t Alignment)
: SyntheticSection(SHF_ALLOC | SHF_WRITE, SHT_NOBITS, Alignment, Name) {		: SyntheticSection(SHF_ALLOC | SHF_WRITE, SHT_NOBITS, Alignment, Name) {
this->Bss = true;		this->Bss = true;
this->Size = Size;		this->Size = Size;
}		}

void BuildIdSection::writeBuildId(ArrayRef<uint8_t> Buf) {		llvm::SmallVector<uint8_t, 32>
		BuildIdSection::writeBuildId(ArrayRef<uint8_t> Buf) {
switch (Config->BuildId) {		switch (Config->BuildId) {
case BuildIdKind::Fast:		case BuildIdKind::Fast:
computeHash(Buf, [](uint8_t *Dest, ArrayRef<uint8_t> Arr) {		computeHash(Buf, [](uint8_t *Dest, ArrayRef<uint8_t> Arr) {
write64le(Dest, xxHash64(Arr));		write64le(Dest, xxHash64(Arr));
});		});
break;		break;
case BuildIdKind::Md5:		case BuildIdKind::Md5:
computeHash(Buf, [](uint8_t *Dest, ArrayRef<uint8_t> Arr) {		computeHash(Buf, [](uint8_t *Dest, ArrayRef<uint8_t> Arr) {
...
if (auto EC = getRandomBytes(HashBuf, HashSize))		if (auto EC = getRandomBytes(HashBuf, HashSize))
error("entropy source failure: " + EC.message());		error("entropy source failure: " + EC.message());
break;		break;
case BuildIdKind::Hexstring:		case BuildIdKind::Hexstring:
memcpy(HashBuf, Config->BuildIdVector.data(), Config->BuildIdVector.size());		memcpy(HashBuf, Config->BuildIdVector.data(), Config->BuildIdVector.size());
break;		break;
default:		default:
llvm_unreachable("unknown BuildIdKind");		llvm_unreachable("unknown BuildIdKind");
}		}
		return {HashBuf, HashBuf + HashSize};
}		}

EhFrameSection::EhFrameSection()		EhFrameSection::EhFrameSection()
: SyntheticSection(SHF_ALLOC, SHT_PROGBITS, 1, ".eh_frame") {}		: SyntheticSection(SHF_ALLOC, SHT_PROGBITS, 1, ".eh_frame") {}

// Search for an existing CIE record or create a new one.		// Search for an existing CIE record or create a new one.
// CIE records from input object files are uniquified by their contents		// CIE records from input object files are uniquified by their contents
// and where their relocations point to.		// and where their relocations point to.

lld/ELF/Writer.cpp

#include "Symbols.h"		#include "Symbols.h"
#include "SyntheticSections.h"		#include "SyntheticSections.h"
#include "Target.h"		#include "Target.h"
#include "lld/Common/Memory.h"		#include "lld/Common/Memory.h"
#include "lld/Common/Strings.h"		#include "lld/Common/Strings.h"
#include "lld/Common/Threads.h"		#include "lld/Common/Threads.h"
#include "llvm/ADT/StringMap.h"		#include "llvm/ADT/StringMap.h"
#include "llvm/ADT/StringSwitch.h"		#include "llvm/ADT/StringSwitch.h"
		#include "llvm/ADT/StringExtras.h"
		#include "llvm/Support/Path.h"
#include <climits>		#include <climits>

using namespace llvm;		using namespace llvm;
using namespace llvm::ELF;		using namespace llvm::ELF;
using namespace llvm::object;		using namespace llvm::object;
using namespace llvm::support;		using namespace llvm::support;
using namespace llvm::support::endian;		using namespace llvm::support::endian;

...
private:
void checkSections();		void checkSections();
void fixSectionAlignments();		void fixSectionAlignments();
void openFile();		void openFile();
void writeTrapInstr();		void writeTrapInstr();
void writeHeader();		void writeHeader();
void writeSections();		void writeSections();
void writeSectionsBinary();		void writeSectionsBinary();
void writeBuildId();		void writeBuildId();
		void writeBuildIdLink();

std::unique_ptr<FileOutputBuffer> &Buffer;		std::unique_ptr<FileOutputBuffer> &Buffer;

void addRelIpltSymbols();		void addRelIpltSymbols();
void addStartEndSymbols();		void addStartEndSymbols();
void addStartStopSymbols(OutputSection *Sec);		void addStartStopSymbols(OutputSection *Sec);
uint64_t getEntryAddr();		uint64_t getEntryAddr();

std::vector<PhdrEntry *> Phdrs;		std::vector<PhdrEntry *> Phdrs;

uint64_t FileSize;		uint64_t FileSize;
uint64_t SectionHeaderOff;		uint64_t SectionHeaderOff;

		SmallVector<uint8_t, 32> BuildIdBytes;
};		};
} // anonymous namespace		} // anonymous namespace

static bool isSectionPrefix(StringRef Prefix, StringRef Name) {		static bool isSectionPrefix(StringRef Prefix, StringRef Name) {
return Name.startswith(Prefix) || Name == Prefix.drop_back();		return Name.startswith(Prefix) || Name == Prefix.drop_back();
}		}

StringRef elf::getOutputSectionName(const InputSectionBase *S) {		StringRef elf::getOutputSectionName(const InputSectionBase *S) {
...
template <class ELFT> void Writer<ELFT>::run() {
...
// Handle -Map and -cref options.		// Handle -Map and -cref options.
writeMapFile();		writeMapFile();
writeCrossReferenceTable();		writeCrossReferenceTable();
if (errorCount())		if (errorCount())
return;		return;

if (auto E = Buffer->commit())		if (auto E = Buffer->commit())
error("failed to write to the output file: " + toString(std::move(E)));		error("failed to write to the output file: " + toString(std::move(E)));

		writeBuildIdLink();
}		}

static bool shouldKeepInSymtab(SectionBase *Sec, StringRef SymName,		static bool shouldKeepInSymtab(SectionBase *Sec, StringRef SymName,
const Symbol &B) {		const Symbol &B) {
if (B.isSection())		if (B.isSection())
return false;		return false;

if (Config->Discard == DiscardPolicy::None)		if (Config->Discard == DiscardPolicy::None)
...

template <class ELFT> void Writer<ELFT>::writeBuildId() {		template <class ELFT> void Writer<ELFT>::writeBuildId() {
if (!InX::BuildId || !InX::BuildId->getParent())		if (!InX::BuildId || !InX::BuildId->getParent())
return;		return;

// Compute a hash of all sections of the output file.		// Compute a hash of all sections of the output file.
uint8_t *Start = Buffer->getBufferStart();		uint8_t *Start = Buffer->getBufferStart();
uint8_t *End = Start + FileSize;		uint8_t *End = Start + FileSize;
InX::BuildId->writeBuildId({Start, End});		BuildIdBytes = InX::BuildId->writeBuildId({Start, End});
		}

		template <class ELFT> void Writer<ELFT>::writeBuildIdLink() {
		if (Config->BuildIdLinkDir.empty() ||
		!InX::BuildId || !InX::BuildId->getParent())
		return;

		if (BuildIdBytes.size() < 2) {
		error("build ID too small for --build-id-link-dir");
		return;
		}

		SmallString<128> Path(Config->BuildIdLinkDir);
		llvm::sys::path::append(Path, llvm::toHex(BuildIdBytes[0], true));
		if (auto EC = sys::fs::create_directories(Path)) {
		error("cannot create build ID link directory " +
		Path + ": " + EC.message());
		return;
		}

		llvm::sys::path::append(Path, llvm::toHex(
		ArrayRef<uint8_t>(&BuildIdBytes.begin()[1], BuildIdBytes.end()), true));
		auto EC = llvm::sys::fs::create_hard_link(Config->OutputFile, Path);
		if (EC) {
		// Hard linking failed, try to remove the file first if it exists.
		if (llvm::sys::fs::exists(Path))
		llvm::sys::fs::remove(Path, true);
		EC = llvm::sys::fs::create_hard_link(Config->OutputFile, Path);
		if (EC)
		error("cannot link " + Path + ": " + EC.message());
		}
}		}

template void elf::writeResult<ELF32LE>();		template void elf::writeResult<ELF32LE>();
template void elf::writeResult<ELF32BE>();		template void elf::writeResult<ELF32BE>();
template void elf::writeResult<ELF64LE>();		template void elf::writeResult<ELF64LE>();
template void elf::writeResult<ELF64BE>();		template void elf::writeResult<ELF64BE>();