
Details

Reviewers

pcc
davide
rafael

Summary

This should be ready'ish for review now.

Diff Detail

Event Timeline

davide updated this revision to Diff 53344.Apr 11 2016, 6:17 PM

davide retitled this revision from to parallel LTO wip.

davide updated this object.

davide added reviewers: davide, pcc, rafael.

Herald added a subscriber: mehdi_amini. Apr 11 2016, 6:17 PM

With/without patch (4 threads with the patch): Release + DebugInfo (without assert) linking clang.

With patch (4 threads):

real    22m27.724s
user    34m21.325s
sys     0m28.535s

Without patch:

real    28m10.310s
user    27m59.016s
sys     0m10.189s

What program is that?

In D18999#397971, @pcc wrote:

What program is that?

Updated comment, sorry hit 'submit' too quickly.

Release without debug info (clang), in the same order as above:

With patch (4 threads):

real    14m56.917s
user    23m47.331s
sys     0m8.045s

Without patch:

real    23m17.522s
user    23m13.189s
sys     0m3.422s

davide updated this revision to Diff 53504.Apr 12 2016, 5:58 PM

davide retitled this revision from parallel LTO wip to [ELF/LTO] Parallel Codegen for LLD.

davide updated this object.

davide edited edge metadata.

davide added a subscriber: llvm-commits.

ruiu added a subscriber: ruiu.Apr 12 2016, 6:09 PM

ruiu added inline comments.

ELF/LTO.cpp
166	Because NumThreads is 1, nothing would run in parallel?

davide added inline comments.Apr 12 2016, 6:24 PM

ELF/LTO.cpp
166	Yes, I have another patch to add an option to specify the number of threads. I put this up to review to understand if the logic is sound. I'm collecting numbers in multiple configurations (1,2,3,4 threads). The previous comments in the review have some very early numbers using 4 threads. Cheers, Davide

On second thought, I added the option (in case somebody else also wants to play with it).

From 28m10.310s to 22m27.724s when using 4 threads is a pretty small speedup.

Any idea what is slow? Is it just that each of the 1/4 sized chunks is still slow? Is the split not even in compile time?

What is the impact on output size? Even better if you have performance numbers for the linked program.

ELF/LTO.cpp
44	Drop the () around many.
143	Is this related to parallel LTO?

I agree I expected a bigger speedup. I'll investigate further (and post more precise numbers) ASAP.

ELF/LTO.cpp
143	you use the returned type as argument to codeine.

davide added inline comments.Apr 12 2016, 8:02 PM

ELF/LTO.cpp
143	*codegen.

Any idea what is slow? Is it just that each of the 1/4 sized chunks is still slow? Is the split not even in compile time?

Parallel LTO codegen is highly beholden to Amdahl's law. With parallel LTO codegen there is by default a large serial optimization phase at the start, which is basically the regular LTO optimization pipeline. The only part we parallelize is the backend. Davide's numbers seem about right if he was not using an --lto-O flag to reduce the opt level (which basically turns off most of the LTO-stage optimizers, so it's most useful if you're using something like -fsanitize=cfi).

Even at lower LTO opt levels, there's a significant amount of serial time spent splitting, serializing and deserializing the partitions. With debug info enabled, the debug info needs to be duplicated between the partitions as well. The best numbers we saw were a roughly 2x speedup with >4 threads, without debug info and at opt level 1.

What we found was that parallel LTO codegen is most useful when you don't mind throwing CPU and RAM at the problem of saving a moderately sized amount of time linking your program. It doesn't completely solve the scalability problem, which is why I've been working towards getting ThinLTO supported in LLD, as the scalability story is much better there.
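As a rough illustration of the Amdahl's law point above, the sketch below backs out the parallelizable fraction implied by the "real" times posted earlier in this review. It assumes the simple model speedup = 1 / ((1 - p) + p / N) with N = 4 threads and ignores the splitting and serialization overhead described above. The function names are hypothetical and are not part of the patch.

```cpp
#include <cassert>
#include <cstdio>

// Convert a "real" time such as 28m10.310s to seconds.
double toSeconds(unsigned Minutes, double Seconds) {
  return Minutes * 60.0 + Seconds;
}

// Amdahl's law: S = 1 / ((1 - p) + p / n), where p is the fraction of the
// work that can run in parallel and n is the number of threads.
// Solving for p gives p = (1 - 1/S) / (1 - 1/n).
double impliedParallelFraction(double SerialSecs, double ParallelSecs,
                               unsigned Threads) {
  assert(Threads > 1 && ParallelSecs > 0 && "need a parallel run to compare");
  double Speedup = SerialSecs / ParallelSecs;
  return (1.0 - 1.0 / Speedup) / (1.0 - 1.0 / Threads);
}

// Prints the fraction implied by each pair of timings from this review
// (without patch vs. 4 threads with the patch).
void printImpliedFractions() {
  double WithDebugInfo = impliedParallelFraction(
      toSeconds(28, 10.310), toSeconds(22, 27.724), 4);
  double WithoutDebugInfo = impliedParallelFraction(
      toSeconds(23, 17.522), toSeconds(14, 56.917), 4);
  std::printf("Release + DebugInfo: p = %.2f\n", WithDebugInfo);
  std::printf("Release:             p = %.2f\n", WithoutDebugInfo);
}
```

Under this simplified model the timings imply a parallelizable fraction of roughly 0.27 with debug info and roughly 0.48 without, which fits the description of a large serial phase ahead of the parallel backend.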

ruiu added inline comments.Apr 13 2016, 9:08 AM

ELF/Driver.cpp
324	Move this before Config->Optimize to sort these lines.
ELF/LTO.cpp
42–43	Move this line just before `raw_fd_ostream OS...` so that the definition gets closer to use.
43	I'd spell this variable `Filename` as one word.
44	I found that appending an empty string is a bit confusing here. I'd probably do if (Many) Filename += utostr(I);
143	This function needs a comment.
146	Where does this `MAttrs` come from?
166	Why don't you just use Config->CGThreads as the number of threads?
185	Remove {}.
ELF/Options.td
241–242	I prefer --thread-count because gold has that option.
ELF/SymbolTable.cpp
132	Move this code before the enclosing for-loop. (Because the entire for-loop is to replace bitcode symbols.)
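The file-naming suggestion for ELF/LTO.cpp line 44 above can be sketched in isolation as follows. This is a minimal stand-alone version that uses std::string and std::to_string in place of LLVM's SmallString and utostr; `ltoObjectFileName` is a hypothetical helper name, not a function in the patch.

```cpp
#include <cassert>
#include <string>

// Builds the name of a temporary LTO object file. When several partitions
// are emitted (Many == true), the partition index I is appended to the
// output name so that each partition gets its own file.
std::string ltoObjectFileName(const std::string &OutputFile, unsigned I,
                              bool Many) {
  std::string Filename = OutputFile;
  if (Many)
    Filename += std::to_string(I);
  Filename += ".lto.o";
  return Filename;
}

// Example: a single partition keeps the plain name, while partition 1 of
// several gets its index inserted before the suffix.
void exampleNames() {
  assert(ltoObjectFileName("a.out", 0, false) == "a.out.lto.o");
  assert(ltoObjectFileName("a.out", 1, true) == "a.out1.lto.o");
}
```

Appending the index only when there are several partitions keeps the single-threaded file name unchanged.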

pcc added inline comments.Apr 13 2016, 10:14 AM

ELF/Options.td
241–242	Gold's `--thread-count` does not affect LTO. Furthermore, parallel LTO codegen can result in the linker producing different binaries based on the number of threads, and users of `--thread-count` probably wouldn't expect that option to change the binary. I think this should probably be something prefixed with `--lto-`.

davide added inline comments.Apr 13 2016, 10:16 AM

ELF/Options.td
241–242	That option has a different semantic. -jobs is what the plugin uses. If you prefer another name, I'm all for it, but I wouldn't clash.
ELF/SymbolTable.cpp
132	Which code in particular?

ruiu added inline comments.Apr 13 2016, 1:42 PM

ELF/Options.td
241–242	Then maybe --lto-thread-count or --lto-jobs?
ELF/SymbolTable.cpp
132	Sorry, I meant this comment.

I cced Eric who seems to think that it should be possible to avoid having multiple TargetMachines, which would be the perfect solution to this problem.

You should be able to avoid it, you just need to handle resetTargetOptions first (at least). You'll also need an optional lock on getSubtarget to deal with that. Those are the only problems I can think of off the top of my head.

-eric

What about the subtargets themselves?

Probably should check, but I don't -think- there's anything mutable there. If there is it should be fixed.

davide mentioned this in rL266390: [LTO] Add a new splitCodeGen() API which takes a TargetMachineFactory..Apr 14 2016, 5:13 PM

Apparently I and pcc replied at the same time.

In D18999#400612, @rafael wrote:

I cced Eric who seems to think that it should be possible to avoid having multiple TargetMachines, which would be the perfect solution to this problem.

So, we currently work around the problem with a Factory.
I opened https://llvm.org/bugs/show_bug.cgi?id=27361 so we don't lose track of this possible work (and I hope I or somebody else can get to it at some point)

So, we currently work around the problem with a Factory.

You mean we *can* work around it, right? In this change we still have
two code paths that create a TargetMachine.

What we can do to work around the problem if fixing pr27361 is too
much work for now is passing a std::function() with no arguments that
returns a std::unique_ptr<TargetMachine> to splitCodeGen.

That way we can use the same code to create a TargetMachine in both places.

Cheers,
Rafael
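The workaround described above can be sketched outside of LLVM as follows. This is a minimal illustration with stand-in types: `FakeTargetMachine`, `makeMachine`, `splitWork`, and `compileAll` are hypothetical names. `splitWork` only mimics the shape of an API that takes a zero-argument factory returning a `std::unique_ptr`; it is not the real `splitCodeGen` signature, and it runs the partitions in a plain loop where a real implementation would use one thread per partition.

```cpp
#include <cassert>
#include <functional>
#include <memory>
#include <string>
#include <vector>

// Stand-in for llvm::TargetMachine (hypothetical, for illustration only).
struct FakeTargetMachine {
  std::string Triple;
};

using MachineFactory = std::function<std::unique_ptr<FakeTargetMachine>()>;

// Mimics a split codegen entry point: each partition asks the factory for
// its own machine instead of sharing one instance.
std::vector<std::string> splitWork(unsigned NumPartitions,
                                   const MachineFactory &Factory) {
  std::vector<std::string> Results;
  for (unsigned I = 0; I < NumPartitions; ++I) {
    std::unique_ptr<FakeTargetMachine> TM = Factory();
    Results.push_back(TM->Triple + "#" + std::to_string(I));
  }
  return Results;
}

// The single place where machines are created; used by both code paths.
std::unique_ptr<FakeTargetMachine> makeMachine() {
  return std::unique_ptr<FakeTargetMachine>(
      new FakeTargetMachine{"x86_64-unknown-linux-gnu"});
}

std::vector<std::string> compileAll(unsigned NumPartitions) {
  // Serial phase: one machine from the shared helper.
  std::unique_ptr<FakeTargetMachine> TM = makeMachine();
  assert(TM && "factory must return a machine");
  // Split phase: the same helper, wrapped as a zero-argument factory.
  return splitWork(NumPartitions, [] { return makeMachine(); });
}
```

Because both phases go through `makeMachine`, there is only one code path that decides how a machine is configured.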

Addressed everybody's comments hopefully. The patch is smaller now.

ruiu added inline comments.Apr 15 2016, 12:40 PM

ELF/Config.h
97	Since this is for --lto-jobs, I'd name LtoJobs.
ELF/LTO.cpp
144	Instead of defining NumThreads, use Config->LtoJobs directly.
155	Does [&]() { return getTargetMachine; } work?

mehdi_amini added inline comments.Apr 15 2016, 1:02 PM

ELF/LTO.cpp
155	Rather: `[this] () { return getTargetMachine(); }`

davide updated this revision to Diff 53942.Apr 15 2016, 1:37 PM

davide marked 3 inline comments as done.

davide added inline comments.

ELF/LTO.cpp
144	No, NumThreads is also used later in the function.

This is looking fine, but I'll let someone who's working on LTO sign off.

Please add a test showing that we can link a program with 2 or more partitions.

Add a test as requested by Peter.

LGTM

This revision is now accepted and ready to land.Apr 15 2016, 3:37 PM

Committed thusly (r266484)

Diff 53960

ELF/Config.h

...
	struct Configuration {
bool ZNodelete;		bool ZNodelete;
bool ZNow;		bool ZNow;
bool ZOrigin;		bool ZOrigin;
bool ZRelro;		bool ZRelro;
BuildIdKind BuildId = BuildIdKind::None;		BuildIdKind BuildId = BuildIdKind::None;
ELFKind EKind = ELFNoneKind;		ELFKind EKind = ELFNoneKind;
uint16_t EMachine = llvm::ELF::EM_NONE;		uint16_t EMachine = llvm::ELF::EM_NONE;
uint64_t EntryAddr = -1;		uint64_t EntryAddr = -1;
		unsigned LtoJobs;
		ruiu (Done): Since this is for --lto-jobs, I'd name LtoJobs.
unsigned LtoO;		unsigned LtoO;
unsigned Optimize;		unsigned Optimize;
};		};

// The only instance of Configuration struct.		// The only instance of Configuration struct.
extern Configuration *Config;		extern Configuration *Config;

} // namespace elf		} // namespace elf
} // namespace lld		} // namespace lld

#endif		#endif

ELF/Driver.cpp

...
	void LinkerDriver::readConfigs(opt::InputArgList &Args) {
Config->OutputFile = getString(Args, OPT_o);		Config->OutputFile = getString(Args, OPT_o);
Config->SoName = getString(Args, OPT_soname);		Config->SoName = getString(Args, OPT_soname);
Config->Sysroot = getString(Args, OPT_sysroot);		Config->Sysroot = getString(Args, OPT_sysroot);

Config->Optimize = getInteger(Args, OPT_O, 0);		Config->Optimize = getInteger(Args, OPT_O, 0);
Config->LtoO = getInteger(Args, OPT_lto_O, 2);		Config->LtoO = getInteger(Args, OPT_lto_O, 2);
if (Config->LtoO > 3)		if (Config->LtoO > 3)
error("invalid optimization level for LTO: " + getString(Args, OPT_lto_O));		error("invalid optimization level for LTO: " + getString(Args, OPT_lto_O));
		Config->LtoJobs = getInteger(Args, OPT_lto_jobs, 1);
		ruiu (Not Done): Move this before Config->Optimize to sort these lines.
		if (Config->LtoJobs == 0)
		error("number of threads must be > 0");

Config->ZExecStack = hasZOption(Args, "execstack");		Config->ZExecStack = hasZOption(Args, "execstack");
Config->ZNodelete = hasZOption(Args, "nodelete");		Config->ZNodelete = hasZOption(Args, "nodelete");
Config->ZNow = hasZOption(Args, "now");		Config->ZNow = hasZOption(Args, "now");
Config->ZOrigin = hasZOption(Args, "origin");		Config->ZOrigin = hasZOption(Args, "origin");
Config->ZRelro = !hasZOption(Args, "norelro");		Config->ZRelro = !hasZOption(Args, "norelro");

if (Config->Relocatable)		if (Config->Relocatable)
...

ELF/LTO.h

	...
	namespace elf {			namespace elf {

	class BitcodeFile;			class BitcodeFile;
	class InputFile;			class InputFile;

	class BitcodeCompiler {			class BitcodeCompiler {
	public:			public:
	void add(BitcodeFile &F);			void add(BitcodeFile &F);
	std::unique_ptr<InputFile> compile();			std::vector<std::unique_ptr<InputFile>> compile();

	BitcodeCompiler()			BitcodeCompiler()
	: Combined(new llvm::Module("ld-temp.o", Context)), Mover(*Combined) {}			: Combined(new llvm::Module("ld-temp.o", Context)), Mover(*Combined) {}

	private:			private:
	llvm::TargetMachine *getTargetMachine();			std::vector<std::unique_ptr<InputFile>> runSplitCodegen();
				std::unique_ptr<llvm::TargetMachine> getTargetMachine();

	llvm::LLVMContext Context;			llvm::LLVMContext Context;
	std::unique_ptr<llvm::Module> Combined;			std::unique_ptr<llvm::Module> Combined;
	llvm::IRMover Mover;			llvm::IRMover Mover;
	SmallString<0> OwningData;			std::vector<SmallString<0>> OwningData;
	std::unique_ptr<MemoryBuffer> MB;			std::unique_ptr<MemoryBuffer> MB;
	llvm::StringSet<> InternalizedSyms;			llvm::StringSet<> InternalizedSyms;
				std::string TheTriple;
	};			};
	}			}
	}			}

	#endif			#endif

ELF/LTO.cpp

	...
	#include "Config.h"			#include "Config.h"
	#include "Error.h"			#include "Error.h"
	#include "InputFiles.h"			#include "InputFiles.h"
	#include "Symbols.h"			#include "Symbols.h"
	#include "llvm/Analysis/TargetLibraryInfo.h"			#include "llvm/Analysis/TargetLibraryInfo.h"
	#include "llvm/Analysis/TargetTransformInfo.h"			#include "llvm/Analysis/TargetTransformInfo.h"
	#include "llvm/Bitcode/ReaderWriter.h"			#include "llvm/Bitcode/ReaderWriter.h"
	#include "llvm/CodeGen/CommandFlags.h"			#include "llvm/CodeGen/CommandFlags.h"
				#include "llvm/CodeGen/ParallelCG.h"
	#include "llvm/IR/LegacyPassManager.h"			#include "llvm/IR/LegacyPassManager.h"
	#include "llvm/Linker/IRMover.h"			#include "llvm/Linker/IRMover.h"
	#include "llvm/Support/StringSaver.h"			#include "llvm/Support/StringSaver.h"
	#include "llvm/Support/TargetRegistry.h"			#include "llvm/Support/TargetRegistry.h"
	#include "llvm/Target/TargetMachine.h"			#include "llvm/Target/TargetMachine.h"
	#include "llvm/Transforms/IPO.h"			#include "llvm/Transforms/IPO.h"
	#include "llvm/Transforms/Utils/ModuleUtils.h"			#include "llvm/Transforms/Utils/ModuleUtils.h"
	#include "llvm/Transforms/IPO/PassManagerBuilder.h"			#include "llvm/Transforms/IPO/PassManagerBuilder.h"

	using namespace llvm;			using namespace llvm;
	using namespace llvm::object;			using namespace llvm::object;
	using namespace llvm::ELF;			using namespace llvm::ELF;

	using namespace lld;			using namespace lld;
	using namespace lld::elf;			using namespace lld::elf;

	// This is for use when debugging LTO.			// This is for use when debugging LTO.
	static void saveLtoObjectFile(StringRef Buffer) {			static void saveLtoObjectFile(StringRef Buffer, unsigned I, bool Many) {
				SmallString<128> Filename = Config->OutputFile;
				if (Many)
				Filename += utostr(I);
				Filename += ".lto.o";
	std::error_code EC;			std::error_code EC;
	raw_fd_ostream OS(Config->OutputFile.str() + ".lto.o", EC,			raw_fd_ostream OS(Filename, EC, sys::fs::OpenFlags::F_None);
				ruiu (Not Done): I'd spell this variable `Filename` as one word.
				ruiu (Not Done): Move this line just before `raw_fd_ostream OS...` so that the definition gets closer to use.
	sys::fs::OpenFlags::F_None);
	check(EC);			check(EC);
				rafael (Not Done): Drop the () around many.
				ruiu (Not Done): I found that appending an empty string is a bit confusing here. I'd probably do if (Many) Filename += utostr(I);
	OS << Buffer;			OS << Buffer;
	}			}

	// This is for use when debugging LTO.			// This is for use when debugging LTO.
	static void saveBCFile(Module &M, StringRef Suffix) {			static void saveBCFile(Module &M, StringRef Suffix) {
	std::error_code EC;			std::error_code EC;
	raw_fd_ostream OS(Config->OutputFile.str() + Suffix.str(), EC,			raw_fd_ostream OS(Config->OutputFile.str() + Suffix.str(), EC,
	sys::fs::OpenFlags::F_None);			sys::fs::OpenFlags::F_None);
	...
	}			}

	static void internalize(GlobalValue &GV) {			static void internalize(GlobalValue &GV) {
	assert(!GV.hasLocalLinkage() &&			assert(!GV.hasLocalLinkage() &&
	"Trying to internalize a symbol with local linkage!");			"Trying to internalize a symbol with local linkage!");
	GV.setLinkage(GlobalValue::InternalLinkage);			GV.setLinkage(GlobalValue::InternalLinkage);
	}			}

				std::vector<std::unique_ptr<InputFile>> BitcodeCompiler::runSplitCodegen() {
				rafael (Not Done): Is this related to parallel LTO?
				davide (author, Not Done): you use the returned type as argument to codeine.
				davide (author, Not Done): *codegen.
				ruiu (Not Done): This function needs a comment.
				unsigned NumThreads = Config->LtoJobs;
				ruiu (Not Done): Instead of defining NumThreads, use Config->LtoJobs directly.
				davide (author, Not Done): No, NumThreads is also used later in the function.
				OwningData.resize(NumThreads);

				ruiu (Not Done): Where does this `MAttrs` come from?
				std::list<raw_svector_ostream> OSs;
				std::vector<raw_pwrite_stream *> OSPtrs;
				for (SmallString<0> &Obj : OwningData) {
				OSs.emplace_back(Obj);
				OSPtrs.push_back(&OSs.back());
				}

				splitCodeGen(std::move(Combined), OSPtrs, {},
				[this]() { return getTargetMachine(); });
				ruiu (Done): Does [&]() { return getTargetMachine; } work?
				mehdi_amini (Done): Rather: `[this] () { return getTargetMachine(); }`

				std::vector<std::unique_ptr<InputFile>> ObjFiles;
				for (SmallString<0> &Obj : OwningData)
				ObjFiles.push_back(createObjectFile(
				MemoryBufferRef(Obj, "LLD-INTERNAL-combined-lto-object")));

				if (Config->SaveTemps)
				for (unsigned I = 0; I < NumThreads; ++I)
				saveLtoObjectFile(OwningData[I], I, NumThreads > 1);

				return ObjFiles;
				ruiu (Not Done): Because NumThreads is 1, nothing would run in parallel?
				davide (author, Not Done): Yes, I have another patch to add an option to specify the number of threads. I put this up to review to understand if the logic is sound. I'm collecting numbers in multiple configurations (1,2,3,4 threads). The previous comments in the review have some very early numbers using 4 threads. Cheers, Davide
				ruiu (Not Done): Why don't you just use Config->CGThreads as the number of threads?
				}

	// Merge all the bitcode files we have seen, codegen the result			// Merge all the bitcode files we have seen, codegen the result
	// and return the resulting ObjectFile.			// and return the resulting ObjectFile.
	std::unique_ptr<InputFile> BitcodeCompiler::compile() {			std::vector<std::unique_ptr<InputFile>> BitcodeCompiler::compile() {
				TheTriple = Combined->getTargetTriple();
	for (const auto &Name : InternalizedSyms) {			for (const auto &Name : InternalizedSyms) {
	GlobalValue *GV = Combined->getNamedValue(Name.first());			GlobalValue *GV = Combined->getNamedValue(Name.first());
	assert(GV);			assert(GV);
	internalize(*GV);			internalize(*GV);
	}			}

	if (Config->SaveTemps)			if (Config->SaveTemps)
	saveBCFile(*Combined, ".lto.bc");			saveBCFile(*Combined, ".lto.bc");

	std::unique_ptr<TargetMachine> TM(getTargetMachine());			std::unique_ptr<TargetMachine> TM(getTargetMachine());
	runLTOPasses(Combined, TM);			runLTOPasses(Combined, TM);

	raw_svector_ostream OS(OwningData);			return runSplitCodegen();
				ruiu (Not Done): Remove {}.
	legacy::PassManager CodeGenPasses;
	if (TM->addPassesToEmitFile(CodeGenPasses, OS,
	TargetMachine::CGFT_ObjectFile))
	fatal("failed to setup codegen");
	CodeGenPasses.run(*Combined);
	MB = MemoryBuffer::getMemBuffer(OwningData,
	"LLD-INTERNAL-combined-lto-object", false);
	if (Config->SaveTemps)
	saveLtoObjectFile(MB->getBuffer());
	return createObjectFile(*MB);
	}			}

	TargetMachine *BitcodeCompiler::getTargetMachine() {			std::unique_ptr<TargetMachine> BitcodeCompiler::getTargetMachine() {
	StringRef TripleStr = Combined->getTargetTriple();
	std::string Msg;			std::string Msg;
	const Target *T = TargetRegistry::lookupTarget(TripleStr, Msg);			const Target *T = TargetRegistry::lookupTarget(TheTriple, Msg);
	if (!T)			if (!T)
	fatal("target not found: " + Msg);			fatal("target not found: " + Msg);
	TargetOptions Options = InitTargetOptionsFromCodeGenFlags();			TargetOptions Options = InitTargetOptionsFromCodeGenFlags();
	Reloc::Model R = Config->Pic ? Reloc::PIC_ : Reloc::Static;			Reloc::Model R = Config->Pic ? Reloc::PIC_ : Reloc::Static;
	return T->createTargetMachine(TripleStr, "", "", Options, R);			return std::unique_ptr<TargetMachine>(
				T->createTargetMachine(TheTriple, "", "", Options, R));
	}			}

ELF/Options.td

	...
	def version_script : Separate<["--"], "version-script">;			def version_script : Separate<["--"], "version-script">;
	def warn_execstack : Flag<["--"], "warn-execstack">;			def warn_execstack : Flag<["--"], "warn-execstack">;
	def warn_shared_textrel : Flag<["--"], "warn-shared-textrel">;			def warn_shared_textrel : Flag<["--"], "warn-shared-textrel">;
	def G : Separate<["-"], "G">;			def G : Separate<["-"], "G">;

	// Aliases for ignored options			// Aliases for ignored options
	def alias_version_script_version_script : Joined<["--"], "version-script=">, Alias<version_script>;			def alias_version_script_version_script : Joined<["--"], "version-script=">, Alias<version_script>;

	// Debugging/developer options			// LTO-related options.
				def lto_jobs : Joined<["--"], "lto-jobs=">,
				HelpText<"Number of threads to run codegen">;
				ruiu (Not Done): I prefer --thread-count because gold has that option.
				pcc (Not Done): Gold's `--thread-count` does not affect LTO. Furthermore, parallel LTO codegen can result in the linker producing different binaries based on the number of threads, and users of `--thread-count` probably wouldn't expect that option to change the binary. I think this should probably be something prefixed with `--lto-`.
				davide (author, Not Done): That option has a different semantic. -jobs is what the plugin uses. If you prefer another name, I'm all for it, but I wouldn't clash.
				ruiu (Not Done): Then maybe --lto-thread-count or --lto-jobs?
	def disable_verify : Flag<["-"], "disable-verify">;			def disable_verify : Flag<["-"], "disable-verify">;
	def mllvm : Separate<["-"], "mllvm">;			def mllvm : Separate<["-"], "mllvm">;
	def save_temps : Flag<["-"], "save-temps">;			def save_temps : Flag<["-"], "save-temps">;

ELF/SymbolTable.cpp

	...
	template <class ELFT> void SymbolTable<ELFT>::addCombinedLtoObject() {			template <class ELFT> void SymbolTable<ELFT>::addCombinedLtoObject() {
	if (BitcodeFiles.empty())			if (BitcodeFiles.empty())
	return;			return;

	// Compile bitcode files.			// Compile bitcode files.
	Lto.reset(new BitcodeCompiler);			Lto.reset(new BitcodeCompiler);
	for (const std::unique_ptr<BitcodeFile> &F : BitcodeFiles)			for (const std::unique_ptr<BitcodeFile> &F : BitcodeFiles)
	Lto->add(*F);			Lto->add(*F);
	std::unique_ptr<InputFile> IF = Lto->compile();			std::vector<std::unique_ptr<InputFile>> IFs = Lto->compile();
	ObjectFile<ELFT> *Obj = cast<ObjectFile<ELFT>>(IF.release());

	// Replace bitcode symbols.			// Replace bitcode symbols.
				for (auto &IF : IFs) {
				ObjectFile<ELFT> *Obj = cast<ObjectFile<ELFT>>(IF.release());

	llvm::DenseSet<StringRef> DummyGroups;			llvm::DenseSet<StringRef> DummyGroups;
	Obj->parse(DummyGroups);			Obj->parse(DummyGroups);
				ruiu (Not Done): Move this code before the enclosing for-loop. (Because the entire for-loop is to replace bitcode symbols.)
				davide (author, Not Done): Which code in particular?
				ruiu (Not Done): Sorry, I meant this comment.
	for (SymbolBody *Body : Obj->getNonLocalSymbols()) {			for (SymbolBody *Body : Obj->getNonLocalSymbols()) {
	Symbol *Sym = insert(Body);			Symbol *Sym = insert(Body);
	Sym->Body->setUsedInRegularObj();			Sym->Body->setUsedInRegularObj();
	if (Sym->Body->isShared())			if (Sym->Body->isShared())
	Sym->Body->MustBeInDynSym = true;			Sym->Body->MustBeInDynSym = true;
	if (Sym->Body->MustBeInDynSym)			if (Sym->Body->MustBeInDynSym)
	Body->MustBeInDynSym = true;			Body->MustBeInDynSym = true;
	if (!Sym->Body->isUndefined() && Body->isUndefined())			if (!Sym->Body->isUndefined() && Body->isUndefined())
	continue;			continue;
	Sym->Body = Body;			Sym->Body = Body;
	}			}
	ObjectFiles.emplace_back(Obj);			ObjectFiles.emplace_back(Obj);
	}			}
				}

	// Add an undefined symbol.			// Add an undefined symbol.
	template <class ELFT>			template <class ELFT>
	SymbolBody *SymbolTable<ELFT>::addUndefined(StringRef Name) {			SymbolBody *SymbolTable<ELFT>::addUndefined(StringRef Name) {
	auto *Sym = new (Alloc)			auto *Sym = new (Alloc)
	UndefinedElf<ELFT>(Name, STB_GLOBAL, STV_DEFAULT, /*Type*/ 0, false);			UndefinedElf<ELFT>(Name, STB_GLOBAL, STV_DEFAULT, /*Type*/ 0, false);
	resolve(Sym);			resolve(Sym);
	return Sym;			return Sym;
	...

test/ELF/basic.s

	...
	# UNKNOWN: unknown argument: --foo			# UNKNOWN: unknown argument: --foo

	# RUN: llvm-mc -filetype=obj -triple=x86_64-unknown-linux %s -o %t			# RUN: llvm-mc -filetype=obj -triple=x86_64-unknown-linux %s -o %t
	# RUN: not ld.lld %t %t -o %t2 2>&1 | FileCheck --check-prefix=DUP %s			# RUN: not ld.lld %t %t -o %t2 2>&1 | FileCheck --check-prefix=DUP %s
	# DUP: duplicate symbol: _start in {{.*}} and {{.*}}			# DUP: duplicate symbol: _start in {{.*}} and {{.*}}

	# RUN: not ld.lld %t -o %t -m wrong_emul 2>&1 | FileCheck --check-prefix=UNKNOWN_EMUL %s			# RUN: not ld.lld %t -o %t -m wrong_emul 2>&1 | FileCheck --check-prefix=UNKNOWN_EMUL %s
	# UNKNOWN_EMUL: unknown emulation: wrong_emul			# UNKNOWN_EMUL: unknown emulation: wrong_emul

				# RUN: not ld.lld %t --lto-jobs=0 2>&1 | FileCheck --check-prefix=NOTHREADS %s
				# NOTHREADS: number of threads must be > 0

test/ELF/lto/parallel.ll

This file was added.

				; RUN: llvm-as -o %t.bc %s
				; RUN: ld.lld -m elf_x86_64 --lto-jobs=2 -save-temps -o %t %t.bc -shared
				; RUN: llvm-nm %t0.lto.o | FileCheck --check-prefix=CHECK0 %s
				; RUN: llvm-nm %t1.lto.o | FileCheck --check-prefix=CHECK1 %s

				target triple = "x86_64-unknown-linux-gnu"

				; CHECK0-NOT: bar
				; CHECK0: T foo
				; CHECK0-NOT: bar
				define void @foo() {
				call void @bar()
				ret void
				}

				; CHECK1-NOT: foo
				; CHECK1: T bar
				; CHECK1-NOT: foo
				define void @bar() {
				call void @foo()
				ret void
				}

This is an archive of the discontinued LLVM Phabricator instance.

[ELF/LTO] Parallel Codegen for LLD
ClosedPublic

