This is an archive of the discontinued LLVM Phabricator instance.

clang/lib/CodeGen/BackendUtil.cpp
771	The error message could be a bit more specific here.
775	We load the first BitcodeModule, should we assert that this is the only one? How do we know this is the right one?
llvm/include/llvm/LTO/LTO.h
224	Since we have the two, it seems worth documenting these.
424	This is a pretty annoying API with ResI as an InOut parameter: at least document it.
llvm/lib/LTO/LTO.cpp
220	Document.
223	Add a comment to explain why it is out-of-line.
241–243	One or two high-level sentences describing what is going to happen in the loop below would be welcome, I think.
253	Is there an expectation that no symbol can be duplicated in the various modules?
292	It is possible to create an InputFile in which we wouldn't find any module. If you don't want to support that, we should detect it in `create()` and return an error there. Otherwise it is UB to call this on a validly created `InputFile`.
318	Add a comment to explain why it is out-of-line.
500–504	We'll arrive here from `for (InputFile::InputModule &IM : Input->Mods)` in `LTO::add()`. How are we distinguishing `BM.getModuleIdentifier()` when there are multiple modules per file? I assume we don't expect multiple modules with a summary in the same input file, but in that case we should check here that the module has been inserted and error otherwise.

pcc marked 10 inline comments as done. Dec 13 2016, 4:03 PM

pcc added inline comments.

clang/lib/CodeGen/BackendUtil.cpp
775	I added some code that makes sure that we load the right module.
llvm/lib/LTO/LTO.cpp
253	If you mean that the modules may not both contain a definition of the symbol, then yes. The entity creating these bitcode files is expected to adhere to this policy. (The same applies to definitions of comdats, so the code on lines 215-221 below remains the same.) It's possible to have a defined symbol in one module and a duplicate undefined symbol with the same name in another module; the situation is similar to module inline asm right now (i.e. PR30396). Whatever solution we come up with for that problem should also apply to this one.
292	Added a check to `create()`.
500–504	`BM.getModuleIdentifier()` will be the same for each `BitcodeModule` in the same input file by construction, so we can just test the return value of `insert` here. (I feel obliged to point out that this is yet another scenario where (path, byte slice) pairs would work better as a representation of ThinLTO modules.)
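
The "test the return value of `insert`" idea can be illustrated without any LLVM types. The following is only a sketch under stated assumptions: `ModuleMap` and `addModule` are hypothetical names, and a plain `std::map` stands in for the ThinLTO module map.

```cpp
#include <map>
#include <string>

// Hypothetical stand-in for a module map keyed by module identifier.
using ModuleMap = std::map<std::string, int>;

// Returns false if a module with the same identifier was already registered,
// so the caller can report an error instead of silently keeping the old entry.
bool addModule(ModuleMap &Map, const std::string &Id, int Payload) {
  // std::map::insert returns {iterator, bool}; the bool is false on duplicates.
  return Map.insert({Id, Payload}).second;
}
```

A second call with the same identifier leaves the first entry untouched and returns `false`, which is the condition a caller would turn into an error.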

Address review comments

mehdi_amini added inline comments. Dec 13 2016, 4:49 PM

clang/lib/CodeGen/BackendUtil.cpp
783	What if `I.first` was already present in the map? This seems to be the same case as below for ThinLTO, where we should check the value of `insert().second`.
llvm/lib/LTO/LTO.cpp
500–504	What is the other scenario? I don't remember what was specific to ThinLTO for this? Also a pair `path/byte slice` doesn't fit the API in general, does it? We don't necessarily have a "file". That said, instead of path the identifier could be a `<BufferID, offset>`, with `BufferID` intended to be a unique identifier provided by the client of the API per-buffer. I'm not sure how deep we'd have to thread this though. I think that as long as when we create a llvm::Module the identifier is consistent with whatever is used here, everything should be OK.

LGTM otherwise.
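
As a rough illustration of the `<BufferID, offset>` identifier floated above, here is a minimal sketch. Everything in it (`ModuleKey`, `ModuleRegistry`, the use of a string as the buffer ID) is an assumption made for the example and is not part of the LTO API.

```cpp
#include <cstdint>
#include <map>
#include <string>
#include <utility>

// Hypothetical identifier: a client-chosen buffer ID plus the byte offset of
// the module within that buffer.
using ModuleKey = std::pair<std::string, std::uint64_t>;

struct ModuleRegistry {
  std::map<ModuleKey, std::string> Modules;

  // Returns false when the same (BufferID, Offset) pair is registered twice.
  bool add(const std::string &BufferID, std::uint64_t Offset,
           const std::string &Description) {
    return Modules.emplace(ModuleKey(BufferID, Offset), Description).second;
  }
};
```

With a key of this shape, two modules in the same buffer stay distinguishable as long as they start at different offsets.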

This revision is now accepted and ready to land. Dec 13 2016, 4:49 PM

pcc added inline comments. Dec 13 2016, 5:16 PM

clang/lib/CodeGen/BackendUtil.cpp
783	That doesn't seem like it can happen. The range for loop starting on line 758 enumerates elements of a map, so each `I.first()` will be unique. We also break out of this loop as soon as we see a module with a summary.
llvm/lib/LTO/LTO.cpp
500–504	> What is the other scenario? I don't remember what was specific to ThinLTO for this?
	The main other scenario is (non-thin) archive files, which is already a problem as we don't handle that correctly in the distributed case, and we have workarounds in both the gold plugin and lld to give `-save-temps` temporaries appropriate names. This is ThinLTO-specific because we don't create temporaries for each input file under regular LTO.
	> Also a pair path/byte slice doesn't fit the API in general, does it? We don't necessarily have a "file".
	True, but it's also the case that we don't necessarily have a "file" now and we're already passing paths around. I imagine that clients which don't use files could just make something up for the offsets as they would make something up for file paths.
	> That said, instead of path the identifier could be a <BufferID, offset>, with BufferID intended to be a unique identifier provided by the client of the API per-buffer.
	I think that could work as well.

Closed by commit rL289621: LTO: Add support for multi-module bitcode files. (authored by pcc). Dec 13 2016, 5:28 PM

This revision was automatically updated to reflect the committed changes.

mehdi_amini added inline comments. Dec 13 2016, 5:37 PM

llvm/lib/LTO/LTO.cpp
500–504	In the C API (and thus in ld64), the expectation is that every input buffer has to be provided with a unique ID. It happens that this ID is usually the file path (with `libFoo.a(member.o)` when a static archive is involved). Because the linker is supplying one member at a time, the unique ID in the API is enough. Having multiple modules per file is new, though, and would break this "unique ID" provided by the linker; we could suffix it when loading each individual bitcode module. The --save-temps is always using integer right now I believe (files are 1.thin.o, 2.thin.o, etc.).
	> True, but it's also the case that we don't necessarily have a "file" now and we're already passing paths around.
	That's just us being inconsistent; I tried to express "ModuleID" instead of path as much as possible.

pcc added a subscriber: davide. Dec 13 2016, 5:51 PM

pcc added inline comments.

llvm/lib/LTO/LTO.cpp
500–504	> Because the linker is supplying one member at a time, the unique ID in the API is enough.
	I think @davide had a case where an archive had two members with the same name, and they were only distinguishable by offset.
	> The --save-temps is always using integer right now I believe (files are 1.thin.o, 2.thin.o, etc.).
	Not always: see http://llvm-cs.pcc.me.uk/lib/LTO/LTOBackend.cpp#77; gold and lld both pass `UseInputModulePath == true` here. We probably want to overhaul how the filenames look for `save-temps`, but I think in the end they should contain the module ID in some form.

Yes, thin-archivecollision.ll in lld/test/lto is an example, but it could be even worse because the two members can be part of the same archive. I find it weird that it's possible to create an archive with two distinct members that have the same name, but apparently ar considers this a legitimate operation.

mehdi_amini added inline comments. Dec 13 2016, 6:00 PM

llvm/lib/LTO/LTO.cpp
500–504	> I think @davide had a case where an archive had two members with the same name, and they were only distinguishable by offset
	I remember this. I wouldn't be too concerned with it here, and would push the responsibility for providing unique IDs to the client. (Nothing prevents the linker from building an ID that always includes the offset, for instance.)
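
To make the "ID that always includes the offset" suggestion concrete, a client-side helper might look like the sketch below. The function name `makeModuleId` and the exact string format are invented for this example; neither gold, lld, nor ld64 is claimed to use it.

```cpp
#include <cstdint>
#include <string>

// Hypothetical helper: formats "archive(member at offset)" so that two members
// with the same name in one archive still get distinct IDs.
std::string makeModuleId(const std::string &Archive, const std::string &Member,
                         std::uint64_t Offset) {
  return Archive + "(" + Member + " at " + std::to_string(Offset) + ")";
}
```

Two members named `member.o` at different offsets of the same archive then produce different identifiers, which is the property the thread is after.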

pcc added inline comments. Dec 13 2016, 6:16 PM

llvm/lib/LTO/LTO.cpp
500–504	Well, in the distributed case some component needs to understand the module IDs in order to distribute the work properly. Part of that may involve translating the module IDs into "file paths" (which may be actual file paths or conceptual ones). If we can arrange to use the same ("file path", offset) scheme in all linkers, that component can be shared between linkers. But to a certain extent this is all hypothetical, I think I'd want to prototype before being sure that this is the right design.

Revision Contents

Path	Size
clang/lib/CodeGen/BackendUtil.cpp	31 lines
clang/test/CodeGen/thinlto_backend.ll	9 lines
llvm/include/llvm/Bitcode/BitcodeReader.h	2 lines
llvm/include/llvm/LTO/LTO.h	35 lines
llvm/include/llvm/LTO/LTOBackend.h	3 lines
llvm/lib/LTO/LTO.cpp	170 lines
llvm/lib/LTO/LTOBackend.cpp	8 lines
llvm/test/LTO/Resolution/X86/empty-bitcode.test	3 lines
llvm/test/LTO/Resolution/X86/mixed_lto.ll	6 lines
llvm/test/LTO/Resolution/X86/multi-thinlto.ll	6 lines

Diff 81322

clang/lib/CodeGen/BackendUtil.cpp

[747 lines not shown]	static void runThinLTOBackend(const CodeGenOptions &CGOpts, Module *M,

// FIXME: We could simply import the modules mentioned in the combined index		// FIXME: We could simply import the modules mentioned in the combined index
// here.		// here.
FunctionImporter::ImportMapTy ImportList;		FunctionImporter::ImportMapTy ImportList;
ComputeCrossModuleImportForModule(M->getModuleIdentifier(), *CombinedIndex,		ComputeCrossModuleImportForModule(M->getModuleIdentifier(), *CombinedIndex,
ImportList);		ImportList);

std::vector<std::unique_ptr<llvm::MemoryBuffer>> OwnedImports;		std::vector<std::unique_ptr<llvm::MemoryBuffer>> OwnedImports;
MapVector<llvm::StringRef, llvm::MemoryBufferRef> ModuleMap;		MapVector<llvm::StringRef, llvm::BitcodeModule> ModuleMap;

for (auto &I : ImportList) {		for (auto &I : ImportList) {
ErrorOr<std::unique_ptr<llvm::MemoryBuffer>> MBOrErr =		ErrorOr<std::unique_ptr<llvm::MemoryBuffer>> MBOrErr =
llvm::MemoryBuffer::getFile(I.first());		llvm::MemoryBuffer::getFile(I.first());
if (!MBOrErr) {		if (!MBOrErr) {
errs() << "Error loading imported file '" << I.first()		errs() << "Error loading imported file '" << I.first()
<< "': " << MBOrErr.getError().message() << "\n";		<< "': " << MBOrErr.getError().message() << "\n";
return;		return;
}		}
ModuleMap[I.first()] = (*MBOrErr)->getMemBufferRef();
		Expected<std::vector<BitcodeModule>> BMsOrErr =
		getBitcodeModuleList(**MBOrErr);
		if (!BMsOrErr) {
		handleAllErrors(BMsOrErr.takeError(), [&](ErrorInfoBase &EIB) {
		errs() << "Error loading imported file '" << I.first()
		mehdi_amini: The error message could be a bit more specific here.
		<< "': " << EIB.message() << '\n';
		});
		return;
		}
		mehdi_amini: We load the first BitcodeModule, should we assert that this is the only one? How do we know this is the right one?
		pcc: I added some code that makes sure that we load the right module.

		// The bitcode file may contain multiple modules, we want the one with a
		// summary.
		bool FoundModule = false;
		for (BitcodeModule &BM : *BMsOrErr) {
		Expected<bool> HasSummary = BM.hasSummary();
		if (HasSummary && *HasSummary) {
		ModuleMap.insert({I.first(), BM});
		mehdi_amini: What if `I.first` was already present in the map? This seems to be the same case as below for ThinLTO, where we should check the value of `insert().second`.
		pcc: That doesn't seem like it can happen. The range for loop starting on line 758 enumerates elements of a map, so each `I.first()` will be unique. We also break out of this loop as soon as we see a module with a summary.
		FoundModule = true;
		break;
		}
		}
		if (!FoundModule) {
		errs() << "Error loading imported file '" << I.first()
		<< "': Could not find module summary\n";
		return;
		}

OwnedImports.push_back(std::move(*MBOrErr));		OwnedImports.push_back(std::move(*MBOrErr));
}		}
auto AddStream = [&](size_t Task) {		auto AddStream = [&](size_t Task) {
return llvm::make_unique<lto::NativeObjectStream>(std::move(OS));		return llvm::make_unique<lto::NativeObjectStream>(std::move(OS));
};		};
lto::Config Conf;		lto::Config Conf;
if (Error E = thinBackend(		if (Error E = thinBackend(
Conf, 0, AddStream, M, CombinedIndex, ImportList,		Conf, 0, AddStream, M, CombinedIndex, ImportList,
[153 lines not shown]
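
The selection logic in the new loop (take the first module that has a summary, otherwise report an error) can be sketched in isolation. This is a simplified model, not the code under review: `FakeModule` and `findModuleWithSummary` are hypothetical names, and a boolean field replaces the `BitcodeModule::hasSummary()` query.

```cpp
#include <cstddef>
#include <optional>
#include <vector>

// Hypothetical stand-in for a bitcode module; only the summary flag matters.
struct FakeModule {
  int Id;
  bool HasSummary;
};

// Returns the index of the first module that carries a summary, or
// std::nullopt when none does (the caller would then report an error).
std::optional<std::size_t>
findModuleWithSummary(const std::vector<FakeModule> &Mods) {
  for (std::size_t I = 0; I != Mods.size(); ++I)
    if (Mods[I].HasSummary)
      return I;
  return std::nullopt;
}
```

Returning on the first match mirrors the `break` in the diff, which is also why at most one entry per imported file can reach the map.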

clang/test/CodeGen/thinlto_backend.ll

	; REQUIRES: x86-registered-target			; REQUIRES: x86-registered-target

	; RUN: opt -module-summary -o %t1.o %s			; RUN: opt -module-summary -o %t1.o %s
	; RUN: opt -module-summary -o %t2.o %S/Inputs/thinlto_backend.ll			; RUN: opt -module-summary -o %t2.o %S/Inputs/thinlto_backend.ll
	; RUN: llvm-lto -thinlto -o %t %t1.o %t2.o			; RUN: llvm-lto -thinlto -o %t %t1.o %t2.o

	; Ensure clang -cc1 give expected error for incorrect input type			; Ensure clang -cc1 give expected error for incorrect input type
	; RUN: not %clang_cc1 -O2 -o %t1.o -x c %s -c -fthinlto-index=%t.thinlto.bc 2>&1 | FileCheck %s -check-prefix=CHECK-WARNING			; RUN: not %clang_cc1 -O2 -o %t1.o -x c %s -c -fthinlto-index=%t.thinlto.bc 2>&1 | FileCheck %s -check-prefix=CHECK-WARNING
	; CHECK-WARNING: error: invalid argument '-fthinlto-index={{.*}}' only allowed with '-x ir'			; CHECK-WARNING: error: invalid argument '-fthinlto-index={{.*}}' only allowed with '-x ir'

	; Ensure we get expected error for missing index file			; Ensure we get expected error for missing index file
	; RUN: %clang -O2 -o %t3.o -x ir %t1.o -c -fthinlto-index=bad.thinlto.bc 2>&1 | FileCheck %s -check-prefix=CHECK-ERROR			; RUN: %clang -O2 -o %t4.o -x ir %t1.o -c -fthinlto-index=bad.thinlto.bc 2>&1 | FileCheck %s -check-prefix=CHECK-ERROR1
	; CHECK-ERROR: Error loading index file 'bad.thinlto.bc'			; CHECK-ERROR1: Error loading index file 'bad.thinlto.bc'

	; Ensure f2 was imported			; Ensure f2 was imported
	; RUN: %clang -target x86_64-unknown-linux-gnu -O2 -o %t3.o -x ir %t1.o -c -fthinlto-index=%t.thinlto.bc			; RUN: %clang -target x86_64-unknown-linux-gnu -O2 -o %t3.o -x ir %t1.o -c -fthinlto-index=%t.thinlto.bc
	; RUN: llvm-nm %t3.o | FileCheck --check-prefix=CHECK-OBJ %s			; RUN: llvm-nm %t3.o | FileCheck --check-prefix=CHECK-OBJ %s
	; CHECK-OBJ: T f1			; CHECK-OBJ: T f1
	; CHECK-OBJ-NOT: U f2			; CHECK-OBJ-NOT: U f2

				; Ensure we get expected error for input files without summaries
				; RUN: opt -o %t2.o %s
				; RUN: %clang -target x86_64-unknown-linux-gnu -O2 -o %t3.o -x ir %t1.o -c -fthinlto-index=%t.thinlto.bc
				; CHECK-ERROR2: Error loading imported file '{{.*}}': Could not find module summary

	target datalayout = "e-m:e-i64:64-f80:128-n8:16:32:64-S128"			target datalayout = "e-m:e-i64:64-f80:128-n8:16:32:64-S128"
	target triple = "x86_64-unknown-linux-gnu"			target triple = "x86_64-unknown-linux-gnu"

	declare void @f2()			declare void @f2()

	define void @f1() {			define void @f1() {
	call void @f2()			call void @f2()
	ret void			ret void
	}			}

llvm/include/llvm/Bitcode/BitcodeReader.h

[64 lines not shown]	class BitcodeModule {
getModuleImpl(LLVMContext &Context, bool MaterializeAll,		getModuleImpl(LLVMContext &Context, bool MaterializeAll,
bool ShouldLazyLoadMetadata);		bool ShouldLazyLoadMetadata);

public:		public:
StringRef getBuffer() const {		StringRef getBuffer() const {
return StringRef((const char *)Buffer.begin(), Buffer.size());		return StringRef((const char *)Buffer.begin(), Buffer.size());
}		}

		StringRef getModuleIdentifier() const { return ModuleIdentifier; }

/// Read the bitcode module and prepare for lazy deserialization of function		/// Read the bitcode module and prepare for lazy deserialization of function
/// bodies. If ShouldLazyLoadMetadata is true, lazily load metadata as well.		/// bodies. If ShouldLazyLoadMetadata is true, lazily load metadata as well.
Expected<std::unique_ptr<Module>>		Expected<std::unique_ptr<Module>>
getLazyModule(LLVMContext &Context, bool ShouldLazyLoadMetadata);		getLazyModule(LLVMContext &Context, bool ShouldLazyLoadMetadata);

/// Read the entire bitcode module and return it.		/// Read the entire bitcode module and return it.
Expected<std::unique_ptr<Module>> parseModule(LLVMContext &Context);		Expected<std::unique_ptr<Module>> parseModule(LLVMContext &Context);

[136 lines not shown]

llvm/include/llvm/LTO/LTO.h

[25 lines not shown]
#include "llvm/Linker/IRMover.h"		#include "llvm/Linker/IRMover.h"
#include "llvm/Object/IRObjectFile.h"		#include "llvm/Object/IRObjectFile.h"
#include "llvm/Support/thread.h"		#include "llvm/Support/thread.h"
#include "llvm/Target/TargetOptions.h"		#include "llvm/Target/TargetOptions.h"
#include "llvm/Transforms/IPO/FunctionImport.h"		#include "llvm/Transforms/IPO/FunctionImport.h"

namespace llvm {		namespace llvm {

		class BitcodeModule;
class Error;		class Error;
class LLVMContext;		class LLVMContext;
class MemoryBufferRef;		class MemoryBufferRef;
class Module;		class Module;
class Target;		class Target;
class raw_pwrite_stream;		class raw_pwrite_stream;

/// Resolve Weak and LinkOnce values in the \p Index. Linkage changes recorded		/// Resolve Weak and LinkOnce values in the \p Index. Linkage changes recorded
[33 lines not shown]
/// information that an LTO client should need in order to do symbol resolution.		/// information that an LTO client should need in order to do symbol resolution.
class InputFile {		class InputFile {
// FIXME: Remove LTO class friendship once we have bitcode symbol tables.		// FIXME: Remove LTO class friendship once we have bitcode symbol tables.
friend LTO;		friend LTO;
InputFile() = default;		InputFile() = default;

// FIXME: Remove the LLVMContext once we have bitcode symbol tables.		// FIXME: Remove the LLVMContext once we have bitcode symbol tables.
LLVMContext Ctx;		LLVMContext Ctx;
		struct InputModule;
		std::vector<InputModule> Mods;
ModuleSymbolTable SymTab;		ModuleSymbolTable SymTab;
std::unique_ptr<Module> Mod;
MemoryBufferRef MBRef;

std::vector<StringRef> Comdats;		std::vector<StringRef> Comdats;
DenseMap<const Comdat *, unsigned> ComdatMap;		DenseMap<const Comdat *, unsigned> ComdatMap;

public:		public:
		~InputFile();

/// Create an InputFile.		/// Create an InputFile.
static Expected<std::unique_ptr<InputFile>> create(MemoryBufferRef Object);		static Expected<std::unique_ptr<InputFile>> create(MemoryBufferRef Object);

class symbol_iterator;		class symbol_iterator;

/// This is a wrapper for ArrayRef<ModuleSymbolTable::Symbol>::iterator that		/// This is a wrapper for ArrayRef<ModuleSymbolTable::Symbol>::iterator that
/// exposes only the information that an LTO client should need in order to do		/// exposes only the information that an LTO client should need in order to do
/// symbol resolution.		/// symbol resolution.
[113 lines not shown]	public:

/// A range over the symbols in this InputFile.		/// A range over the symbols in this InputFile.
iterator_range<symbol_iterator> symbols() {		iterator_range<symbol_iterator> symbols() {
return llvm::make_range(		return llvm::make_range(
symbol_iterator(SymTab.symbols().begin(), SymTab, this),		symbol_iterator(SymTab.symbols().begin(), SymTab, this),
symbol_iterator(SymTab.symbols().end(), SymTab, this));		symbol_iterator(SymTab.symbols().end(), SymTab, this));
}		}

StringRef getSourceFileName() const { return Mod->getSourceFileName(); }		/// Returns the path to the InputFile.
MemoryBufferRef getMemoryBufferRef() const { return MBRef; }		StringRef getName() const;
		mehdi_amini: Since we have the two, it seems worth documenting these.

		/// Returns the source file path specified at compile time.
		StringRef getSourceFileName() const;

// Returns a table with all the comdats used by this file.		// Returns a table with all the comdats used by this file.
ArrayRef<StringRef> getComdatTable() const { return Comdats; }		ArrayRef<StringRef> getComdatTable() const { return Comdats; }

		private:
		iterator_range<symbol_iterator> module_symbols(InputModule &IM);
};		};

/// This class wraps an output stream for a native object. Most clients should		/// This class wraps an output stream for a native object. Most clients should
/// just be able to return an instance of this base class from the stream		/// just be able to return an instance of this base class from the stream
/// callback, but if a client needs to perform some action after the stream is		/// callback, but if a client needs to perform some action after the stream is
/// written to, that can be done by deriving from this class and overriding the		/// written to, that can be done by deriving from this class and overriding the
/// destructor.		/// destructor.
class NativeObjectStream {		class NativeObjectStream {
[73 lines not shown]
public:		public:
/// Create an LTO object. A default constructed LTO object has a reasonable		/// Create an LTO object. A default constructed LTO object has a reasonable
/// production configuration, but you can customize it by passing arguments to		/// production configuration, but you can customize it by passing arguments to
/// this constructor.		/// this constructor.
/// FIXME: We do currently require the DiagHandler field to be set in Conf.		/// FIXME: We do currently require the DiagHandler field to be set in Conf.
/// Until that is fixed, a Config argument is required.		/// Until that is fixed, a Config argument is required.
LTO(Config Conf, ThinBackend Backend = nullptr,		LTO(Config Conf, ThinBackend Backend = nullptr,
unsigned ParallelCodeGenParallelismLevel = 1);		unsigned ParallelCodeGenParallelismLevel = 1);
		~LTO();

/// Add an input file to the LTO link, using the provided symbol resolutions.		/// Add an input file to the LTO link, using the provided symbol resolutions.
/// The symbol resolutions must appear in the enumeration order given by		/// The symbol resolutions must appear in the enumeration order given by
/// InputFile::symbols().		/// InputFile::symbols().
Error add(std::unique_ptr<InputFile> Obj, ArrayRef<SymbolResolution> Res);		Error add(std::unique_ptr<InputFile> Obj, ArrayRef<SymbolResolution> Res);

/// Returns an upper bound on the number of tasks that the client may expect.		/// Returns an upper bound on the number of tasks that the client may expect.
/// This may only be called after all IR object files have been added. For a		/// This may only be called after all IR object files have been added. For a
[30 lines not shown]	struct RegularLTOState {
std::unique_ptr<IRMover> Mover;		std::unique_ptr<IRMover> Mover;
} RegularLTO;		} RegularLTO;

struct ThinLTOState {		struct ThinLTOState {
ThinLTOState(ThinBackend Backend);		ThinLTOState(ThinBackend Backend);

ThinBackend Backend;		ThinBackend Backend;
ModuleSummaryIndex CombinedIndex;		ModuleSummaryIndex CombinedIndex;
MapVector<StringRef, MemoryBufferRef> ModuleMap;		MapVector<StringRef, BitcodeModule> ModuleMap;
DenseMap<GlobalValue::GUID, StringRef> PrevailingModuleForGUID;		DenseMap<GlobalValue::GUID, StringRef> PrevailingModuleForGUID;
} ThinLTO;		} ThinLTO;

// The global resolution for a particular (mangled) symbol name. This is in		// The global resolution for a particular (mangled) symbol name. This is in
// particular necessary to track whether each symbol can be internalized.		// particular necessary to track whether each symbol can be internalized.
// Because any input file may introduce a new cross-partition reference, we		// Because any input file may introduce a new cross-partition reference, we
// cannot make any final internalization decisions until all input files have		// cannot make any final internalization decisions until all input files have
// been added and the client has called run(). During run() we apply		// been added and the client has called run(). During run() we apply
[31 lines not shown]	private:

// Global mapping from mangled symbol names to resolutions.		// Global mapping from mangled symbol names to resolutions.
StringMap<GlobalResolution> GlobalResolutions;		StringMap<GlobalResolution> GlobalResolutions;

void addSymbolToGlobalRes(SmallPtrSet<GlobalValue *, 8> &Used,		void addSymbolToGlobalRes(SmallPtrSet<GlobalValue *, 8> &Used,
const InputFile::Symbol &Sym, SymbolResolution Res,		const InputFile::Symbol &Sym, SymbolResolution Res,
unsigned Partition);		unsigned Partition);

Error addRegularLTO(std::unique_ptr<InputFile> Input,		// These functions take a range of symbol resolutions [ResI, ResE) and consume
ArrayRef<SymbolResolution> Res);		// the resolutions used by a single input module by incrementing ResI. After
Error addThinLTO(std::unique_ptr<InputFile> Input,		// these functions return, [ResI, ResE) will refer to the resolution range for
ArrayRef<SymbolResolution> Res);		// the remaining modules in the InputFile.
		Error addModule(InputFile &Input, InputFile::InputModule &IM,
		const SymbolResolution *&ResI, const SymbolResolution *ResE);
		Error addRegularLTO(BitcodeModule BM, const SymbolResolution *&ResI,
		mehdi_amini: This is a pretty annoying API with ResI as an InOut parameter: at least document it.
		const SymbolResolution *ResE);
		Error addThinLTO(BitcodeModule BM, Module &M,
		iterator_range<InputFile::symbol_iterator> Syms,
		const SymbolResolution *&ResI, const SymbolResolution *ResE);

Error runRegularLTO(AddStreamFn AddStream);		Error runRegularLTO(AddStreamFn AddStream);
Error runThinLTO(AddStreamFn AddStream, NativeObjectCache Cache,		Error runThinLTO(AddStreamFn AddStream, NativeObjectCache Cache,
bool HasRegularLTO);		bool HasRegularLTO);

mutable bool CalledGetMaxTasks = false;		mutable bool CalledGetMaxTasks = false;
};		};

[21 lines not shown]
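
The in-out `ResI` convention documented in the comment above `addModule` (each call consumes its share of `[ResI, ResE)` and leaves `ResI` pointing at the remainder) can be shown with a small model. The sketch below uses `int` in place of `SymbolResolution`, and `takeResolutions` is a hypothetical name; it is not the code from the patch.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Consumes the first Count entries of [ResI, ResE) and advances ResI past
// them, so the caller can hand the remainder to the next module.
std::vector<int> takeResolutions(const int *&ResI, const int *ResE,
                                 std::size_t Count) {
  assert(static_cast<std::size_t>(ResE - ResI) >= Count && "range too short");
  std::vector<int> Taken(ResI, ResI + Count);
  ResI += Count; // In-out parameter: the caller observes the new position.
  return Taken;
}
```

After the last module has been processed, a caller can compare `ResI` against `ResE` to check that every resolution was consumed.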

llvm/include/llvm/LTO/LTOBackend.h

	[21 lines not shown]
	#include "llvm/IR/ModuleSummaryIndex.h"			#include "llvm/IR/ModuleSummaryIndex.h"
	#include "llvm/LTO/LTO.h"			#include "llvm/LTO/LTO.h"
	#include "llvm/Support/MemoryBuffer.h"			#include "llvm/Support/MemoryBuffer.h"
	#include "llvm/Target/TargetOptions.h"			#include "llvm/Target/TargetOptions.h"
	#include "llvm/Transforms/IPO/FunctionImport.h"			#include "llvm/Transforms/IPO/FunctionImport.h"

	namespace llvm {			namespace llvm {

				class BitcodeModule;
	class Error;			class Error;
	class Module;			class Module;
	class Target;			class Target;

	namespace lto {			namespace lto {

	/// Runs a regular LTO backend.			/// Runs a regular LTO backend.
	Error backend(Config &C, AddStreamFn AddStream,			Error backend(Config &C, AddStreamFn AddStream,
	unsigned ParallelCodeGenParallelismLevel,			unsigned ParallelCodeGenParallelismLevel,
	std::unique_ptr<Module> M);			std::unique_ptr<Module> M);

	/// Runs a ThinLTO backend.			/// Runs a ThinLTO backend.
	Error thinBackend(Config &C, unsigned Task, AddStreamFn AddStream, Module &M,			Error thinBackend(Config &C, unsigned Task, AddStreamFn AddStream, Module &M,
	ModuleSummaryIndex &CombinedIndex,			ModuleSummaryIndex &CombinedIndex,
	const FunctionImporter::ImportMapTy &ImportList,			const FunctionImporter::ImportMapTy &ImportList,
	const GVSummaryMapTy &DefinedGlobals,			const GVSummaryMapTy &DefinedGlobals,
	MapVector<StringRef, MemoryBufferRef> &ModuleMap);			MapVector<StringRef, BitcodeModule> &ModuleMap);
	}			}
	}			}

	#endif			#endif

llvm/lib/LTO/LTO.cpp

[208 lines not shown]
// as external and non-exported values as internal.		// as external and non-exported values as internal.
void llvm::thinLTOInternalizeAndPromoteInIndex(		void llvm::thinLTOInternalizeAndPromoteInIndex(
ModuleSummaryIndex &Index,		ModuleSummaryIndex &Index,
function_ref<bool(StringRef, GlobalValue::GUID)> isExported) {		function_ref<bool(StringRef, GlobalValue::GUID)> isExported) {
for (auto &I : Index)		for (auto &I : Index)
thinLTOInternalizeAndPromoteGUID(I.second, I.first, isExported);		thinLTOInternalizeAndPromoteGUID(I.second, I.first, isExported);
}		}

		struct InputFile::InputModule {
		BitcodeModule BM;
		std::unique_ptr<Module> Mod;

		mehdi_amini: Document.
		// The range of ModuleSymbolTable entries for this input module.
		size_t SymBegin, SymEnd;
		};
		mehdi_amini: Add a comment to explain why it is out-of-line.

		// Requires a destructor for std::vector<InputModule>.
		InputFile::~InputFile() = default;

Expected<std::unique_ptr<InputFile>> InputFile::create(MemoryBufferRef Object) {		Expected<std::unique_ptr<InputFile>> InputFile::create(MemoryBufferRef Object) {
std::unique_ptr<InputFile> File(new InputFile);		std::unique_ptr<InputFile> File(new InputFile);

ErrorOr<MemoryBufferRef> BCOrErr =		ErrorOr<MemoryBufferRef> BCOrErr =
IRObjectFile::findBitcodeInMemBuffer(Object);		IRObjectFile::findBitcodeInMemBuffer(Object);
if (!BCOrErr)		if (!BCOrErr)
return errorCodeToError(BCOrErr.getError());		return errorCodeToError(BCOrErr.getError());
File->MBRef = *BCOrErr;

		Expected<std::vector<BitcodeModule>> BMsOrErr =
		getBitcodeModuleList(*BCOrErr);
		if (!BMsOrErr)
		return BMsOrErr.takeError();

		if (BMsOrErr->empty())
		return make_error<StringError>("Bitcode file does not contain any modules",
		inconvertibleErrorCode());
		mehdi_amini: One or two high-level sentences describing what is going to happen in the loop below would be welcome, I think.

		// Create an InputModule for each module in the InputFile, and add it to the
		// ModuleSymbolTable.
		for (auto BM : *BMsOrErr) {
Expected<std::unique_ptr<Module>> MOrErr =		Expected<std::unique_ptr<Module>> MOrErr =
getLazyBitcodeModule(*BCOrErr, File->Ctx,		BM.getLazyModule(File->Ctx, /*ShouldLazyLoadMetadata*/ true);
/*ShouldLazyLoadMetadata*/ true);
if (!MOrErr)		if (!MOrErr)
return MOrErr.takeError();		return MOrErr.takeError();

File->Mod = std::move(*MOrErr);		size_t SymBegin = File->SymTab.symbols().size();
		mehdi_amini: Is there an expectation that no symbol can be duplicated in the various modules?
		pcc: If you mean that the modules may not both contain a definition of the symbol, then yes. The entity creating these bitcode files is expected to adhere to this policy. (The same applies to definitions of comdats, so the code on lines 215-221 below remains the same.) It's possible to have a defined symbol in one module and a duplicate undefined symbol with the same name in another module; the situation is similar to module inline asm right now (i.e. PR30396). Whatever solution we come up with for that problem should also apply to this one.
File->SymTab.addModule(File->Mod.get());		File->SymTab.addModule(MOrErr->get());
		size_t SymEnd = File->SymTab.symbols().size();
for (const auto &C : File->Mod->getComdatSymbolTable()) {
auto P =		for (const auto &C : (*MOrErr)->getComdatSymbolTable()) {
File->ComdatMap.insert(std::make_pair(&C.second, File->Comdats.size()));		auto P = File->ComdatMap.insert(
		std::make_pair(&C.second, File->Comdats.size()));
assert(P.second);		assert(P.second);
(void)P;		(void)P;
File->Comdats.push_back(C.first());		File->Comdats.push_back(C.first());
}		}

		File->Mods.push_back({BM, std::move(*MOrErr), SymBegin, SymEnd});
		}

return std::move(File);		return std::move(File);
}		}
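
The point of the emptiness check in `create()` is that later accessors index `Mods[0]` unconditionally. A stripped-down model of that contract follows; `FakeInputFile` and `createInputFile` are hypothetical, and an error string replaces `llvm::Expected` so the sketch needs only the standard library.

```cpp
#include <memory>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for InputFile: accessors assume at least one module.
struct FakeInputFile {
  std::vector<std::string> Mods;
  const std::string &firstName() const { return Mods[0]; }
};

// Rejects empty inputs up front so that firstName() is always safe to call on
// a successfully created object. On failure, returns nullptr and sets Err.
std::unique_ptr<FakeInputFile> createInputFile(std::vector<std::string> Mods,
                                               std::string &Err) {
  if (Mods.empty()) {
    Err = "Bitcode file does not contain any modules";
    return nullptr;
  }
  auto File = std::make_unique<FakeInputFile>();
  File->Mods = std::move(Mods);
  return File;
}
```

Validating in the factory keeps the invariant in one place instead of requiring every accessor to re-check it.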

Expected<int> InputFile::Symbol::getComdatIndex() const {		Expected<int> InputFile::Symbol::getComdatIndex() const {
if (!isGV())		if (!isGV())
return -1;		return -1;
const GlobalObject *GO = getGV()->getBaseObject();		const GlobalObject *GO = getGV()->getBaseObject();
if (!GO)		if (!GO)
return make_error<StringError>("Unable to determine comdat of alias!",		return make_error<StringError>("Unable to determine comdat of alias!",
inconvertibleErrorCode());		inconvertibleErrorCode());
if (const Comdat *C = GO->getComdat()) {		if (const Comdat *C = GO->getComdat()) {
auto I = File->ComdatMap.find(C);		auto I = File->ComdatMap.find(C);
assert(I != File->ComdatMap.end());		assert(I != File->ComdatMap.end());
return I->second;		return I->second;
}		}
return -1;		return -1;
}		}

		StringRef InputFile::getName() const {
		return Mods[0].BM.getModuleIdentifier();
		}

		StringRef InputFile::getSourceFileName() const {
		return Mods[0].Mod->getSourceFileName();
		}
		mehdi_amini (marked done): You can create an InputFile where we wouldn't find any module. If you don't want to support that, we should detect it in `create()` and return an error there. Otherwise it is UB to call this on a validly created `InputFile`.
		pcc: Added a check to `create()`.

		iterator_range<InputFile::symbol_iterator>
		InputFile::module_symbols(InputModule &IM) {
		return llvm::make_range(
		symbol_iterator(SymTab.symbols().data() + IM.SymBegin, SymTab, this),
		symbol_iterator(SymTab.symbols().data() + IM.SymEnd, SymTab, this));
		}
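
The per-module symbol ranges used by `module_symbols` can be illustrated with a minimal standalone sketch. It does not use the LLVM types; `FlatSymbolTable`, `ModuleRange` and `moduleSymbols` are hypothetical names chosen for illustration, and symbols are plain strings. The idea it shows is the one in the diff: every module appends its symbols to one shared table, and the file records the half-open range `[SymBegin, SymEnd)` that each module contributed.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical stand-in for InputFile: one flat symbol list shared by all
// modules, plus a half-open [SymBegin, SymEnd) range per module.
struct FlatSymbolTable {
  struct ModuleRange {
    std::size_t SymBegin;
    std::size_t SymEnd;
  };

  std::vector<std::string> Symbols;
  std::vector<ModuleRange> Mods;

  // Append one module's symbols and remember which slice they occupy.
  void addModule(const std::vector<std::string> &ModuleSymbols) {
    std::size_t Begin = Symbols.size();
    Symbols.insert(Symbols.end(), ModuleSymbols.begin(), ModuleSymbols.end());
    Mods.push_back({Begin, Symbols.size()});
  }

  // Return a copy of the symbols that belong to module ModIdx.
  std::vector<std::string> moduleSymbols(std::size_t ModIdx) const {
    const ModuleRange &R = Mods.at(ModIdx);
    assert(R.SymBegin <= R.SymEnd && R.SymEnd <= Symbols.size());
    auto First = Symbols.begin() + static_cast<std::ptrdiff_t>(R.SymBegin);
    auto Last = Symbols.begin() + static_cast<std::ptrdiff_t>(R.SymEnd);
    return std::vector<std::string>(First, Last);
  }
};
```

Storing offsets rather than iterators keeps the ranges valid when later modules grow the shared table, which is presumably why the patch records `SymBegin` and `SymEnd` as `size_t` values.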

LTO::RegularLTOState::RegularLTOState(unsigned ParallelCodeGenParallelismLevel,		LTO::RegularLTOState::RegularLTOState(unsigned ParallelCodeGenParallelismLevel,
Config &Conf)		Config &Conf)
: ParallelCodeGenParallelismLevel(ParallelCodeGenParallelismLevel),		: ParallelCodeGenParallelismLevel(ParallelCodeGenParallelismLevel),
Ctx(Conf) {}		Ctx(Conf) {}

LTO::ThinLTOState::ThinLTOState(ThinBackend Backend) : Backend(Backend) {		LTO::ThinLTOState::ThinLTOState(ThinBackend Backend) : Backend(Backend) {
if (!Backend)		if (!Backend)
this->Backend =		this->Backend =
createInProcessThinBackend(llvm::heavyweight_hardware_concurrency());		createInProcessThinBackend(llvm::heavyweight_hardware_concurrency());
}		}

LTO::LTO(Config Conf, ThinBackend Backend,		LTO::LTO(Config Conf, ThinBackend Backend,
unsigned ParallelCodeGenParallelismLevel)		unsigned ParallelCodeGenParallelismLevel)
: Conf(std::move(Conf)),		: Conf(std::move(Conf)),
RegularLTO(ParallelCodeGenParallelismLevel, this->Conf),		RegularLTO(ParallelCodeGenParallelismLevel, this->Conf),
ThinLTO(std::move(Backend)) {}		ThinLTO(std::move(Backend)) {}

		// Requires a destructor for MapVector<BitcodeModule>.
		mehdi_amini (marked done): Add a comment to explain why it is out-of-line.
		LTO::~LTO() = default;

// Add the given symbol to the GlobalResolutions map, and resolve its partition.		// Add the given symbol to the GlobalResolutions map, and resolve its partition.
void LTO::addSymbolToGlobalRes(SmallPtrSet<GlobalValue *, 8> &Used,		void LTO::addSymbolToGlobalRes(SmallPtrSet<GlobalValue *, 8> &Used,
const InputFile::Symbol &Sym,		const InputFile::Symbol &Sym,
SymbolResolution Res, unsigned Partition) {		SymbolResolution Res, unsigned Partition) {
GlobalValue *GV = Sym.isGV() ? Sym.getGV() : nullptr;		GlobalValue *GV = Sym.isGV() ? Sym.getGV() : nullptr;

auto &GlobalRes = GlobalResolutions[Sym.getName()];		auto &GlobalRes = GlobalResolutions[Sym.getName()];
if (GV) {		if (GV) {
GlobalRes.UnnamedAddr &= GV->hasGlobalUnnamedAddr();		GlobalRes.UnnamedAddr &= GV->hasGlobalUnnamedAddr();
if (Res.Prevailing)		if (Res.Prevailing)
GlobalRes.IRName = GV->getName();		GlobalRes.IRName = GV->getName();
}		}
if (Res.VisibleToRegularObj || (GV && Used.count(GV)) ||		if (Res.VisibleToRegularObj || (GV && Used.count(GV)) ||
(GlobalRes.Partition != GlobalResolution::Unknown &&		(GlobalRes.Partition != GlobalResolution::Unknown &&
GlobalRes.Partition != Partition))		GlobalRes.Partition != Partition))
GlobalRes.Partition = GlobalResolution::External;		GlobalRes.Partition = GlobalResolution::External;
else		else
GlobalRes.Partition = Partition;		GlobalRes.Partition = Partition;
}		}
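
The partition rule in `addSymbolToGlobalRes` can be isolated as a small pure function. The following is a sketch under stated assumptions, not the LLVM code: `resolvePartition` is a hypothetical helper, and the sentinel values `Unknown` and `External` are picked here only for illustration. It shows that a symbol keeps a concrete partition only while every reference to it comes from that same partition and nothing outside LTO can see it.

```cpp
#include <cassert>

// Hypothetical sentinel values; the real ones live in GlobalResolution.
constexpr unsigned Unknown = ~0u;
constexpr unsigned External = ~0u - 1;

// Compute the new partition of a symbol after seeing one more reference.
//   Current:   partition recorded so far (Unknown if never seen).
//   Requested: partition of the module currently being added.
unsigned resolvePartition(unsigned Current, unsigned Requested,
                          bool VisibleToRegularObj, bool Used) {
  assert(Requested != Unknown && Requested != External);
  // Visible outside LTO, listed as used, or already claimed by a different
  // partition: the symbol must be treated as external.
  if (VisibleToRegularObj || Used ||
      (Current != Unknown && Current != Requested))
    return External;
  return Requested;
}
```

Note that once a symbol is `External` it stays `External` in this sketch, because `External` never compares equal to a requested partition.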

static void writeToResolutionFile(raw_ostream &OS, InputFile *Input,		static void writeToResolutionFile(raw_ostream &OS, InputFile *Input,
ArrayRef<SymbolResolution> Res) {		ArrayRef<SymbolResolution> Res) {
StringRef Path = Input->getMemoryBufferRef().getBufferIdentifier();		StringRef Path = Input->getName();
OS << Path << '\n';		OS << Path << '\n';
auto ResI = Res.begin();		auto ResI = Res.begin();
for (const InputFile::Symbol &Sym : Input->symbols()) {		for (const InputFile::Symbol &Sym : Input->symbols()) {
assert(ResI != Res.end());		assert(ResI != Res.end());
SymbolResolution Res = *ResI++;		SymbolResolution Res = *ResI++;

OS << "-r=" << Path << ',' << Sym.getName() << ',';		OS << "-r=" << Path << ',' << Sym.getName() << ',';
if (Res.Prevailing)		if (Res.Prevailing)
[... 9 lines not shown ...]

Error LTO::add(std::unique_ptr<InputFile> Input,		Error LTO::add(std::unique_ptr<InputFile> Input,
ArrayRef<SymbolResolution> Res) {		ArrayRef<SymbolResolution> Res) {
assert(!CalledGetMaxTasks);		assert(!CalledGetMaxTasks);

if (Conf.ResolutionFile)		if (Conf.ResolutionFile)
writeToResolutionFile(*Conf.ResolutionFile, Input.get(), Res);		writeToResolutionFile(*Conf.ResolutionFile, Input.get(), Res);

		const SymbolResolution *ResI = Res.begin();
		for (InputFile::InputModule &IM : Input->Mods)
		if (Error Err = addModule(*Input, IM, ResI, Res.end()))
		return Err;

		assert(ResI == Res.end());
		return Error::success();
		}

		Error LTO::addModule(InputFile &Input, InputFile::InputModule &IM,
		const SymbolResolution *&ResI,
		const SymbolResolution *ResE) {
// FIXME: move to backend		// FIXME: move to backend
Module &M = *Input->Mod;		Module &M = *IM.Mod;
if (!Conf.OverrideTriple.empty())		if (!Conf.OverrideTriple.empty())
M.setTargetTriple(Conf.OverrideTriple);		M.setTargetTriple(Conf.OverrideTriple);
else if (M.getTargetTriple().empty())		else if (M.getTargetTriple().empty())
M.setTargetTriple(Conf.DefaultTriple);		M.setTargetTriple(Conf.DefaultTriple);

Expected<bool> HasThinLTOSummary = hasGlobalValueSummary(Input->MBRef);		Expected<bool> HasThinLTOSummary = IM.BM.hasSummary();
if (!HasThinLTOSummary)		if (!HasThinLTOSummary)
return HasThinLTOSummary.takeError();		return HasThinLTOSummary.takeError();

if (*HasThinLTOSummary)		if (*HasThinLTOSummary)
return addThinLTO(std::move(Input), Res);		return addThinLTO(IM.BM, M, Input.module_symbols(IM), ResI, ResE);
else		else
return addRegularLTO(std::move(Input), Res);		return addRegularLTO(IM.BM, ResI, ResE);
}		}
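
The in/out `ResI` cursor that `LTO::add` threads through `addModule` can be sketched without any LLVM types. The names `Resolution`, `consumeModule` and `consumeFile` below are hypothetical, and a module is reduced to its symbol count. What the sketch illustrates is the contract: each module consumes exactly as many resolutions as it has symbols, advancing the shared cursor, and the caller checks that the cursor ends exactly at `ResE`.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical, reduced version of SymbolResolution.
struct Resolution {
  bool Prevailing;
};

// Consume NumSymbols resolutions starting at ResI. ResI is an in/out
// parameter: on return it points just past the last resolution consumed.
// Returns false if the resolutions run out early.
bool consumeModule(std::size_t NumSymbols, const Resolution *&ResI,
                   const Resolution *ResE, std::size_t &PrevailingCount) {
  assert(ResI <= ResE);
  for (std::size_t I = 0; I != NumSymbols; ++I) {
    if (ResI == ResE)
      return false;
    if (ResI->Prevailing)
      ++PrevailingCount;
    ++ResI;
  }
  return true;
}

// Walk all modules of one file with a single shared cursor and require
// that every resolution is consumed exactly once.
bool consumeFile(const std::vector<std::size_t> &SymbolsPerModule,
                 const std::vector<Resolution> &Res,
                 std::size_t &PrevailingCount) {
  const Resolution *ResI = Res.data();
  const Resolution *ResE = Res.data() + Res.size();
  for (std::size_t N : SymbolsPerModule)
    if (!consumeModule(N, ResI, ResE, PrevailingCount))
      return false;
  return ResI == ResE;
}
```

The patch uses assertions where this sketch returns `false`; the sketch reports mismatches as a return value only so that the behaviour is observable in a small test.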

// Add a regular LTO object to the link.		// Add a regular LTO object to the link.
Error LTO::addRegularLTO(std::unique_ptr<InputFile> Input,		Error LTO::addRegularLTO(BitcodeModule BM, const SymbolResolution *&ResI,
ArrayRef<SymbolResolution> Res) {		const SymbolResolution *ResE) {
if (!RegularLTO.CombinedModule) {		if (!RegularLTO.CombinedModule) {
RegularLTO.CombinedModule =		RegularLTO.CombinedModule =
llvm::make_unique<Module>("ld-temp.o", RegularLTO.Ctx);		llvm::make_unique<Module>("ld-temp.o", RegularLTO.Ctx);
RegularLTO.Mover = llvm::make_unique<IRMover>(*RegularLTO.CombinedModule);		RegularLTO.Mover = llvm::make_unique<IRMover>(*RegularLTO.CombinedModule);
}		}
Expected<std::unique_ptr<Module>> MOrErr =		Expected<std::unique_ptr<Module>> MOrErr =
getLazyBitcodeModule(Input->MBRef, RegularLTO.Ctx,		BM.getLazyModule(RegularLTO.Ctx, /*ShouldLazyLoadMetadata*/ true);
/*ShouldLazyLoadMetadata*/ true);
if (!MOrErr)		if (!MOrErr)
return MOrErr.takeError();		return MOrErr.takeError();

Module &M = **MOrErr;		Module &M = **MOrErr;
if (Error Err = M.materializeMetadata())		if (Error Err = M.materializeMetadata())
return Err;		return Err;
UpgradeDebugInfo(M);		UpgradeDebugInfo(M);

ModuleSymbolTable SymTab;		ModuleSymbolTable SymTab;
SymTab.addModule(&M);		SymTab.addModule(&M);

SmallPtrSet<GlobalValue *, 8> Used;		SmallPtrSet<GlobalValue *, 8> Used;
collectUsedGlobalVariables(M, Used, /*CompilerUsed*/ false);		collectUsedGlobalVariables(M, Used, /*CompilerUsed*/ false);

std::vector<GlobalValue *> Keep;		std::vector<GlobalValue *> Keep;

for (GlobalVariable &GV : M.globals())		for (GlobalVariable &GV : M.globals())
if (GV.hasAppendingLinkage())		if (GV.hasAppendingLinkage())
Keep.push_back(&GV);		Keep.push_back(&GV);

auto ResI = Res.begin();
for (const InputFile::Symbol &Sym :		for (const InputFile::Symbol &Sym :
make_range(InputFile::symbol_iterator(SymTab.symbols().begin(), SymTab,		make_range(InputFile::symbol_iterator(SymTab.symbols().begin(), SymTab,
nullptr),		nullptr),
InputFile::symbol_iterator(SymTab.symbols().end(), SymTab,		InputFile::symbol_iterator(SymTab.symbols().end(), SymTab,
nullptr))) {		nullptr))) {
assert(ResI != Res.end());		assert(ResI != ResE);
SymbolResolution Res = *ResI++;		SymbolResolution Res = *ResI++;
addSymbolToGlobalRes(Used, Sym, Res, 0);		addSymbolToGlobalRes(Used, Sym, Res, 0);

if (Sym.getFlags() & object::BasicSymbolRef::SF_Undefined)		if (Sym.getFlags() & object::BasicSymbolRef::SF_Undefined)
continue;		continue;
if (Res.Prevailing && Sym.isGV()) {		if (Res.Prevailing && Sym.isGV()) {
GlobalValue *GV = Sym.getGV();		GlobalValue *GV = Sym.getGV();
Keep.push_back(GV);		Keep.push_back(GV);
[... 17 lines not shown; context: if (Sym.getFlags() & object::BasicSymbolRef::SF_Common) { ...]
auto &CommonRes = RegularLTO.Commons[Sym.getGV()->getName()];		auto &CommonRes = RegularLTO.Commons[Sym.getGV()->getName()];
CommonRes.Size = std::max(CommonRes.Size, Sym.getCommonSize());		CommonRes.Size = std::max(CommonRes.Size, Sym.getCommonSize());
CommonRes.Align = std::max(CommonRes.Align, Sym.getCommonAlignment());		CommonRes.Align = std::max(CommonRes.Align, Sym.getCommonAlignment());
CommonRes.Prevailing |= Res.Prevailing;		CommonRes.Prevailing |= Res.Prevailing;
}		}

// FIXME: use proposed local attribute for FinalDefinitionInLinkageUnit.		// FIXME: use proposed local attribute for FinalDefinitionInLinkageUnit.
}		}
assert(ResI == Res.end());

return RegularLTO.Mover->move(std::move(*MOrErr), Keep,		return RegularLTO.Mover->move(std::move(*MOrErr), Keep,
[](GlobalValue &, IRMover::ValueAdder) {},		[](GlobalValue &, IRMover::ValueAdder) {},
/* LinkModuleInlineAsm */ true,		/* LinkModuleInlineAsm */ true,
/* IsPerformingImport */ false);		/* IsPerformingImport */ false);
}		}

// Add a ThinLTO object to the link.		// Add a ThinLTO object to the link.
Error LTO::addThinLTO(std::unique_ptr<InputFile> Input,		// FIXME: This function should not need to take as many parameters once we have
ArrayRef<SymbolResolution> Res) {		// a bitcode symbol table.
Module &M = *Input->Mod;		Error LTO::addThinLTO(BitcodeModule BM, Module &M,
		iterator_range<InputFile::symbol_iterator> Syms,
		const SymbolResolution *&ResI,
		const SymbolResolution *ResE) {
SmallPtrSet<GlobalValue *, 8> Used;		SmallPtrSet<GlobalValue *, 8> Used;
collectUsedGlobalVariables(M, Used, /*CompilerUsed*/ false);		collectUsedGlobalVariables(M, Used, /*CompilerUsed*/ false);

MemoryBufferRef MBRef = Input->MBRef;		Expected<std::unique_ptr<ModuleSummaryIndex>> SummaryOrErr = BM.getSummary();
Expected<std::unique_ptr<object::ModuleSummaryIndexObjectFile>>		if (!SummaryOrErr)
SummaryObjOrErr = object::ModuleSummaryIndexObjectFile::create(MBRef);		return SummaryOrErr.takeError();
if (!SummaryObjOrErr)		ThinLTO.CombinedIndex.mergeFrom(std::move(*SummaryOrErr),
return SummaryObjOrErr.takeError();
ThinLTO.CombinedIndex.mergeFrom((*SummaryObjOrErr)->takeIndex(),
ThinLTO.ModuleMap.size());		ThinLTO.ModuleMap.size());

auto ResI = Res.begin();		for (const InputFile::Symbol &Sym : Syms) {
for (const InputFile::Symbol &Sym : Input->symbols()) {		assert(ResI != ResE);
assert(ResI != Res.end());
SymbolResolution Res = *ResI++;		SymbolResolution Res = *ResI++;
addSymbolToGlobalRes(Used, Sym, Res, ThinLTO.ModuleMap.size() + 1);		addSymbolToGlobalRes(Used, Sym, Res, ThinLTO.ModuleMap.size() + 1);

if (Res.Prevailing && Sym.isGV())		if (Res.Prevailing && Sym.isGV())
ThinLTO.PrevailingModuleForGUID[Sym.getGV()->getGUID()] =		ThinLTO.PrevailingModuleForGUID[Sym.getGV()->getGUID()] =
MBRef.getBufferIdentifier();		BM.getModuleIdentifier();
}		}
assert(ResI == Res.end());

ThinLTO.ModuleMap[MBRef.getBufferIdentifier()] = MBRef;		if (!ThinLTO.ModuleMap.insert({BM.getModuleIdentifier(), BM}).second)
		return make_error<StringError>(
		"Expected at most one ThinLTO module per bitcode file",
		inconvertibleErrorCode());

		mehdi_amini (marked done): We'll arrive here from `for (InputFile::InputModule &IM : Input->Mods)` in `LTO::add()`. How are we distinguishing `BM.getModuleIdentifier()` when there are multiple modules per file? I assume we don't expect multiple modules with a summary in the same input file, but in that case we should check here that the module has been inserted and error otherwise.
		pcc: `BM.getModuleIdentifier()` will be the same for each `BitcodeModule` in the same input file by construction, so we can just test the return value of `insert` here. (I feel obliged to point out that this is yet another scenario where (path, byte slice) pairs would work better as a representation of ThinLTO modules.)
		mehdi_amini: What is the other scenario? I don't remember what was specific to ThinLTO for this? Also, a `path/byte slice` pair doesn't fit the API in general, does it? We don't necessarily have a "file". That said, instead of a path the identifier could be a `<BufferID, offset>`, with `BufferID` intended to be a unique identifier provided by the client of the API per buffer. I'm not sure how deep we'd have to thread this though. I think that as long as the identifier is consistent with whatever is used here when we create an llvm::Module, everything should be OK.
		pcc:
		> What is the other scenario? I don't remember what was specific to ThinLTO for this?
		The main other scenario is (non-thin) archive files, which is already a problem as we don't handle that correctly in the distributed case, and we have workarounds in both the gold plugin and lld to give `-save-temps` temporaries appropriate names. This is ThinLTO-specific because we don't create temporaries for each input file under regular LTO.
		> Also a pair path/byte slice doesn't fit the API in general, does it? We don't necessarily have a "file".
		True, but it's also the case that we don't necessarily have a "file" now and we're already passing paths around. I imagine that clients which don't use files could just make something up for the offsets as they would make something up for file paths.
		> That said, instead of path the identifier could be a <BufferID, offset>, with BufferID intended to be a unique identifier provided by the client of the API per-buffer.
		I think that could work as well.
		mehdi_amini: In the C API (and thus in ld64), the expectation is that every input buffer has to be provided with a unique ID. It happens that this ID is usually the path of the file (with `libFoo.a(member.o)` when a static archive is involved). Because the linker is supplying one member at a time, the unique ID in the API is enough. Having multiple modules per file is new though and would break this "unique ID" provided by the linker; we could suffix it when loading each individual bitcode/module though. The --save-temps is always using integers right now I believe (files are 1.thin.o, 2.thin.o, etc.).
		> True, but it's also the case that we don't necessarily have a "file" now and we're already passing paths around.
		That's just us being inconsistent; I tried to express "ModuleID" instead of path as much as possible.
		pcc:
		> Because the linker is supplying one member at a time, the unique ID in the API is enough.
		I think @davide had a case where an archive had two members with the same name, and they were only distinguishable by offset.
		> The --save-temps is always using integer right now I believe (files are 1.thin.o, 2.thin.o, etc.).
		Not always: http://llvm-cs.pcc.me.uk/lib/LTO/LTOBackend.cpp#77 gold and lld both pass `UseInputModulePath == true` here. We probably want to overhaul how the filenames look for `save-temps`, but I think in the end they should contain the module ID in some form.
		mehdi_amini:
		> I think @davide had a case where an archive had two members with the same name, and they were only distinguishable by offset
		I remember this. I wouldn't be too concerned with it here, and would push the responsibility to provide unique IDs to the client. (Nothing prevents the linker from building an ID that always includes the offset, for instance.)
		pcc: Well, in the distributed case some component needs to understand the module IDs in order to distribute the work properly. Part of that may involve translating the module IDs into "file paths" (which may be actual file paths or conceptual ones). If we can arrange to use the same ("file path", offset) scheme in all linkers, that component can be shared between linkers. But to a certain extent this is all hypothetical; I think I'd want to prototype before being sure that this is the right design.
return Error::success();		return Error::success();
}		}

unsigned LTO::getMaxTasks() const {		unsigned LTO::getMaxTasks() const {
CalledGetMaxTasks = true;		CalledGetMaxTasks = true;
return RegularLTO.ParallelCodeGenParallelismLevel + ThinLTO.ModuleMap.size();		return RegularLTO.ParallelCodeGenParallelismLevel + ThinLTO.ModuleMap.size();
}		}

[... 80 lines not shown ...]
public:		public:
ThinBackendProc(Config &Conf, ModuleSummaryIndex &CombinedIndex,		ThinBackendProc(Config &Conf, ModuleSummaryIndex &CombinedIndex,
const StringMap<GVSummaryMapTy> &ModuleToDefinedGVSummaries)		const StringMap<GVSummaryMapTy> &ModuleToDefinedGVSummaries)
: Conf(Conf), CombinedIndex(CombinedIndex),		: Conf(Conf), CombinedIndex(CombinedIndex),
ModuleToDefinedGVSummaries(ModuleToDefinedGVSummaries) {}		ModuleToDefinedGVSummaries(ModuleToDefinedGVSummaries) {}

virtual ~ThinBackendProc() {}		virtual ~ThinBackendProc() {}
virtual Error start(		virtual Error start(
unsigned Task, MemoryBufferRef MBRef,		unsigned Task, BitcodeModule BM,
const FunctionImporter::ImportMapTy &ImportList,		const FunctionImporter::ImportMapTy &ImportList,
const FunctionImporter::ExportSetTy &ExportList,		const FunctionImporter::ExportSetTy &ExportList,
const std::map<GlobalValue::GUID, GlobalValue::LinkageTypes> &ResolvedODR,		const std::map<GlobalValue::GUID, GlobalValue::LinkageTypes> &ResolvedODR,
MapVector<StringRef, MemoryBufferRef> &ModuleMap) = 0;		MapVector<StringRef, BitcodeModule> &ModuleMap) = 0;
virtual Error wait() = 0;		virtual Error wait() = 0;
};		};

namespace {		namespace {
class InProcessThinBackend : public ThinBackendProc {		class InProcessThinBackend : public ThinBackendProc {
ThreadPool BackendThreadPool;		ThreadPool BackendThreadPool;
AddStreamFn AddStream;		AddStreamFn AddStream;
NativeObjectCache Cache;		NativeObjectCache Cache;

Optional<Error> Err;		Optional<Error> Err;
std::mutex ErrMu;		std::mutex ErrMu;

public:		public:
InProcessThinBackend(		InProcessThinBackend(
Config &Conf, ModuleSummaryIndex &CombinedIndex,		Config &Conf, ModuleSummaryIndex &CombinedIndex,
unsigned ThinLTOParallelismLevel,		unsigned ThinLTOParallelismLevel,
const StringMap<GVSummaryMapTy> &ModuleToDefinedGVSummaries,		const StringMap<GVSummaryMapTy> &ModuleToDefinedGVSummaries,
AddStreamFn AddStream, NativeObjectCache Cache)		AddStreamFn AddStream, NativeObjectCache Cache)
: ThinBackendProc(Conf, CombinedIndex, ModuleToDefinedGVSummaries),		: ThinBackendProc(Conf, CombinedIndex, ModuleToDefinedGVSummaries),
BackendThreadPool(ThinLTOParallelismLevel),		BackendThreadPool(ThinLTOParallelismLevel),
AddStream(std::move(AddStream)), Cache(std::move(Cache)) {}		AddStream(std::move(AddStream)), Cache(std::move(Cache)) {}

Error runThinLTOBackendThread(		Error runThinLTOBackendThread(
AddStreamFn AddStream, NativeObjectCache Cache, unsigned Task,		AddStreamFn AddStream, NativeObjectCache Cache, unsigned Task,
MemoryBufferRef MBRef, ModuleSummaryIndex &CombinedIndex,		BitcodeModule BM, ModuleSummaryIndex &CombinedIndex,
const FunctionImporter::ImportMapTy &ImportList,		const FunctionImporter::ImportMapTy &ImportList,
const FunctionImporter::ExportSetTy &ExportList,		const FunctionImporter::ExportSetTy &ExportList,
const std::map<GlobalValue::GUID, GlobalValue::LinkageTypes> &ResolvedODR,		const std::map<GlobalValue::GUID, GlobalValue::LinkageTypes> &ResolvedODR,
const GVSummaryMapTy &DefinedGlobals,		const GVSummaryMapTy &DefinedGlobals,
MapVector<StringRef, MemoryBufferRef> &ModuleMap) {		MapVector<StringRef, BitcodeModule> &ModuleMap) {
auto RunThinBackend = [&](AddStreamFn AddStream) {		auto RunThinBackend = [&](AddStreamFn AddStream) {
LTOLLVMContext BackendContext(Conf);		LTOLLVMContext BackendContext(Conf);
Expected<std::unique_ptr<Module>> MOrErr =		Expected<std::unique_ptr<Module>> MOrErr = BM.parseModule(BackendContext);
parseBitcodeFile(MBRef, BackendContext);
if (!MOrErr)		if (!MOrErr)
return MOrErr.takeError();		return MOrErr.takeError();

return thinBackend(Conf, Task, AddStream, **MOrErr, CombinedIndex,		return thinBackend(Conf, Task, AddStream, **MOrErr, CombinedIndex,
ImportList, DefinedGlobals, ModuleMap);		ImportList, DefinedGlobals, ModuleMap);
};		};

auto ModuleID = MBRef.getBufferIdentifier();		auto ModuleID = BM.getModuleIdentifier();

if (!Cache || !CombinedIndex.modulePaths().count(ModuleID) ||		if (!Cache || !CombinedIndex.modulePaths().count(ModuleID) ||
all_of(CombinedIndex.getModuleHash(ModuleID),		all_of(CombinedIndex.getModuleHash(ModuleID),
[](uint32_t V) { return V == 0; }))		[](uint32_t V) { return V == 0; }))
// Cache disabled or no entry for this module in the combined index or		// Cache disabled or no entry for this module in the combined index or
// no module hash.		// no module hash.
return RunThinBackend(AddStream);		return RunThinBackend(AddStream);

SmallString<40> Key;		SmallString<40> Key;
// The module may be cached, this helps handling it.		// The module may be cached, this helps handling it.
computeCacheKey(Key, Conf, CombinedIndex, ModuleID, ImportList, ExportList,		computeCacheKey(Key, Conf, CombinedIndex, ModuleID, ImportList, ExportList,
ResolvedODR, DefinedGlobals);		ResolvedODR, DefinedGlobals);
if (AddStreamFn CacheAddStream = Cache(Task, Key))		if (AddStreamFn CacheAddStream = Cache(Task, Key))
return RunThinBackend(CacheAddStream);		return RunThinBackend(CacheAddStream);

return Error::success();		return Error::success();
}		}
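
The cache-bypass condition in `runThinLTOBackendThread` can be read as a single predicate. The sketch below is an illustration only: `shouldBypassCache` is a hypothetical name, the cache and the combined index are reduced to two booleans, and the hash is assumed to be an array of five 32-bit words. An all-zero hash is treated as "no hash recorded", in which case no meaningful cache key can be computed and the backend runs directly.

```cpp
#include <algorithm>
#include <array>
#include <cassert>
#include <cstdint>

// Assumed shape of a module hash for this sketch: five 32-bit words.
using ModuleHashWords = std::array<std::uint32_t, 5>;

// Returns true when the backend should run without consulting the cache:
// the cache is disabled, the module has no entry in the combined index,
// or its recorded hash is all zeros.
bool shouldBypassCache(bool CacheEnabled, bool HasIndexEntry,
                       const ModuleHashWords &Hash) {
  return !CacheEnabled || !HasIndexEntry ||
         std::all_of(Hash.begin(), Hash.end(),
                     [](std::uint32_t V) { return V == 0; });
}
```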

Error start(		Error start(
unsigned Task, MemoryBufferRef MBRef,		unsigned Task, BitcodeModule BM,
const FunctionImporter::ImportMapTy &ImportList,		const FunctionImporter::ImportMapTy &ImportList,
const FunctionImporter::ExportSetTy &ExportList,		const FunctionImporter::ExportSetTy &ExportList,
const std::map<GlobalValue::GUID, GlobalValue::LinkageTypes> &ResolvedODR,		const std::map<GlobalValue::GUID, GlobalValue::LinkageTypes> &ResolvedODR,
MapVector<StringRef, MemoryBufferRef> &ModuleMap) override {		MapVector<StringRef, BitcodeModule> &ModuleMap) override {
StringRef ModulePath = MBRef.getBufferIdentifier();		StringRef ModulePath = BM.getModuleIdentifier();
assert(ModuleToDefinedGVSummaries.count(ModulePath));		assert(ModuleToDefinedGVSummaries.count(ModulePath));
const GVSummaryMapTy &DefinedGlobals =		const GVSummaryMapTy &DefinedGlobals =
ModuleToDefinedGVSummaries.find(ModulePath)->second;		ModuleToDefinedGVSummaries.find(ModulePath)->second;
BackendThreadPool.async(		BackendThreadPool.async(
[=](MemoryBufferRef MBRef, ModuleSummaryIndex &CombinedIndex,		[=](BitcodeModule BM, ModuleSummaryIndex &CombinedIndex,
const FunctionImporter::ImportMapTy &ImportList,		const FunctionImporter::ImportMapTy &ImportList,
const FunctionImporter::ExportSetTy &ExportList,		const FunctionImporter::ExportSetTy &ExportList,
const std::map<GlobalValue::GUID, GlobalValue::LinkageTypes>		const std::map<GlobalValue::GUID, GlobalValue::LinkageTypes>
&ResolvedODR,		&ResolvedODR,
const GVSummaryMapTy &DefinedGlobals,		const GVSummaryMapTy &DefinedGlobals,
MapVector<StringRef, MemoryBufferRef> &ModuleMap) {		MapVector<StringRef, BitcodeModule> &ModuleMap) {
Error E = runThinLTOBackendThread(		Error E = runThinLTOBackendThread(
AddStream, Cache, Task, MBRef, CombinedIndex, ImportList,		AddStream, Cache, Task, BM, CombinedIndex, ImportList,
ExportList, ResolvedODR, DefinedGlobals, ModuleMap);		ExportList, ResolvedODR, DefinedGlobals, ModuleMap);
if (E) {		if (E) {
std::unique_lock<std::mutex> L(ErrMu);		std::unique_lock<std::mutex> L(ErrMu);
if (Err)		if (Err)
Err = joinErrors(std::move(*Err), std::move(E));		Err = joinErrors(std::move(*Err), std::move(E));
else		else
Err = std::move(E);		Err = std::move(E);
}		}
},		},
MBRef, std::ref(CombinedIndex), std::ref(ImportList),		BM, std::ref(CombinedIndex), std::ref(ImportList),
std::ref(ExportList), std::ref(ResolvedODR), std::ref(DefinedGlobals),		std::ref(ExportList), std::ref(ResolvedODR), std::ref(DefinedGlobals),
std::ref(ModuleMap));		std::ref(ModuleMap));
return Error::success();		return Error::success();
}		}

Error wait() override {		Error wait() override {
BackendThreadPool.wait();		BackendThreadPool.wait();
if (Err)		if (Err)
[... 49 lines not shown; context: WriteIndexesThinBackend( ...]
std::string OldPrefix, std::string NewPrefix, bool ShouldEmitImportsFiles,		std::string OldPrefix, std::string NewPrefix, bool ShouldEmitImportsFiles,
std::string LinkedObjectsFileName)		std::string LinkedObjectsFileName)
: ThinBackendProc(Conf, CombinedIndex, ModuleToDefinedGVSummaries),		: ThinBackendProc(Conf, CombinedIndex, ModuleToDefinedGVSummaries),
OldPrefix(OldPrefix), NewPrefix(NewPrefix),		OldPrefix(OldPrefix), NewPrefix(NewPrefix),
ShouldEmitImportsFiles(ShouldEmitImportsFiles),		ShouldEmitImportsFiles(ShouldEmitImportsFiles),
LinkedObjectsFileName(LinkedObjectsFileName) {}		LinkedObjectsFileName(LinkedObjectsFileName) {}

Error start(		Error start(
unsigned Task, MemoryBufferRef MBRef,		unsigned Task, BitcodeModule BM,
const FunctionImporter::ImportMapTy &ImportList,		const FunctionImporter::ImportMapTy &ImportList,
const FunctionImporter::ExportSetTy &ExportList,		const FunctionImporter::ExportSetTy &ExportList,
const std::map<GlobalValue::GUID, GlobalValue::LinkageTypes> &ResolvedODR,		const std::map<GlobalValue::GUID, GlobalValue::LinkageTypes> &ResolvedODR,
MapVector<StringRef, MemoryBufferRef> &ModuleMap) override {		MapVector<StringRef, BitcodeModule> &ModuleMap) override {
StringRef ModulePath = MBRef.getBufferIdentifier();		StringRef ModulePath = BM.getModuleIdentifier();
std::string NewModulePath =		std::string NewModulePath =
getThinLTOOutputFile(ModulePath, OldPrefix, NewPrefix);		getThinLTOOutputFile(ModulePath, OldPrefix, NewPrefix);

std::error_code EC;		std::error_code EC;
if (!LinkedObjectsFileName.empty()) {		if (!LinkedObjectsFileName.empty()) {
if (!LinkedObjectsFile) {		if (!LinkedObjectsFile) {
LinkedObjectsFile = llvm::make_unique<raw_fd_ostream>(		LinkedObjectsFile = llvm::make_unique<raw_fd_ostream>(
LinkedObjectsFileName, EC, sys::fs::OpenFlags::F_None);		LinkedObjectsFileName, EC, sys::fs::OpenFlags::F_None);
[... 128 lines not shown ...]

llvm/lib/LTO/LTOBackend.cpp

[... 312 lines not shown; context: Error lto::backend(Config &C, AddStreamFn AddStream, ...]
}		}
return Error::success();		return Error::success();
}		}

Error lto::thinBackend(Config &Conf, unsigned Task, AddStreamFn AddStream,		Error lto::thinBackend(Config &Conf, unsigned Task, AddStreamFn AddStream,
Module &Mod, ModuleSummaryIndex &CombinedIndex,		Module &Mod, ModuleSummaryIndex &CombinedIndex,
const FunctionImporter::ImportMapTy &ImportList,		const FunctionImporter::ImportMapTy &ImportList,
const GVSummaryMapTy &DefinedGlobals,		const GVSummaryMapTy &DefinedGlobals,
MapVector<StringRef, MemoryBufferRef> &ModuleMap) {		MapVector<StringRef, BitcodeModule> &ModuleMap) {
Expected<const Target *> TOrErr = initAndLookupTarget(Conf, Mod);		Expected<const Target *> TOrErr = initAndLookupTarget(Conf, Mod);
if (!TOrErr)		if (!TOrErr)
return TOrErr.takeError();		return TOrErr.takeError();

std::unique_ptr<TargetMachine> TM =		std::unique_ptr<TargetMachine> TM =
createTargetMachine(Conf, Mod.getTargetTriple(), *TOrErr);		createTargetMachine(Conf, Mod.getTargetTriple(), *TOrErr);

handleAsmUndefinedRefs(Mod, *TM);		handleAsmUndefinedRefs(Mod, *TM);
[... 18 lines not shown; context: Error lto::thinBackend(Config &Conf, unsigned Task, AddStreamFn AddStream, ...]

if (Conf.PostInternalizeModuleHook &&		if (Conf.PostInternalizeModuleHook &&
!Conf.PostInternalizeModuleHook(Task, Mod))		!Conf.PostInternalizeModuleHook(Task, Mod))
return Error::success();		return Error::success();

auto ModuleLoader = [&](StringRef Identifier) {		auto ModuleLoader = [&](StringRef Identifier) {
assert(Mod.getContext().isODRUniquingDebugTypes() &&		assert(Mod.getContext().isODRUniquingDebugTypes() &&
"ODR Type uniquing should be enabled on the context");		"ODR Type uniquing should be enabled on the context");
return getLazyBitcodeModule(ModuleMap[Identifier], Mod.getContext(),		auto I = ModuleMap.find(Identifier);
		assert(I != ModuleMap.end());
		return I->second.getLazyModule(Mod.getContext(),
/*ShouldLazyLoadMetadata=*/true);		/*ShouldLazyLoadMetadata=*/true);
};		};

FunctionImporter Importer(CombinedIndex, ModuleLoader);		FunctionImporter Importer(CombinedIndex, ModuleLoader);
if (Error Err = Importer.importFunctions(Mod, ImportList).takeError())		if (Error Err = Importer.importFunctions(Mod, ImportList).takeError())
return Err;		return Err;

if (Conf.PostImportModuleHook && !Conf.PostImportModuleHook(Task, Mod))		if (Conf.PostImportModuleHook && !Conf.PostImportModuleHook(Task, Mod))
return Error::success();		return Error::success();

if (!opt(Conf, TM.get(), Task, Mod, /*IsThinLTO=*/true))		if (!opt(Conf, TM.get(), Task, Mod, /*IsThinLTO=*/true))
return Error::success();		return Error::success();

codegen(Conf, TM.get(), AddStream, Task, Mod);		codegen(Conf, TM.get(), AddStream, Task, Mod);
return Error::success();		return Error::success();
}		}

llvm/test/LTO/Resolution/X86/empty-bitcode.test

This file was added.

				RUN: llvm-cat -o %t.o
				RUN: not llvm-lto2 -o %t2 %t.o 2>&1 | FileCheck %s
				CHECK: Bitcode file does not contain any modules

llvm/test/LTO/Resolution/X86/mixed_lto.ll

	; Test mixed-mode LTO (mix of regular and thin LTO objects)			; Test mixed-mode LTO (mix of regular and thin LTO objects)
	; RUN: opt %s -o %t1.o			; RUN: opt %s -o %t1.o
	; RUN: opt -module-summary %p/Inputs/mixed_lto.ll -o %t2.o			; RUN: opt -module-summary %p/Inputs/mixed_lto.ll -o %t2.o

	; RUN: llvm-lto2 -o %t3.o %t2.o %t1.o -r %t2.o,main,px -r %t2.o,g, -r %t1.o,g,px			; RUN: llvm-lto2 -o %t3.o %t2.o %t1.o -r %t2.o,main,px -r %t2.o,g, -r %t1.o,g,px

	; Task 0 is the regular LTO file (this file)			; Task 0 is the regular LTO file (this file)
	; RUN: llvm-nm %t3.o.0 | FileCheck %s --check-prefix=NM0			; RUN: llvm-nm %t3.o.0 | FileCheck %s --check-prefix=NM0
	; NM0: T g			; NM0: T g

	; Task 1 is the (first) ThinLTO file (Inputs/mixed_lto.ll)			; Task 1 is the (first) ThinLTO file (Inputs/mixed_lto.ll)
	; RUN: llvm-nm %t3.o.1 | FileCheck %s --check-prefix=NM1			; RUN: llvm-nm %t3.o.1 | FileCheck %s --check-prefix=NM1
	; NM1-DAG: T main			; NM1-DAG: T main
	; NM1-DAG: U g			; NM1-DAG: U g

				; Do the same test again, but with the regular and thin LTO modules in the same file.
				; RUN: llvm-cat -b -o %t4.o %t2.o %t1.o
				; RUN: llvm-lto2 -o %t5.o %t4.o -r %t4.o,main,px -r %t4.o,g, -r %t4.o,g,px
				; RUN: llvm-nm %t5.o.0 | FileCheck %s --check-prefix=NM0
				; RUN: llvm-nm %t5.o.1 | FileCheck %s --check-prefix=NM1

	target triple = "x86_64-unknown-linux-gnu"			target triple = "x86_64-unknown-linux-gnu"
	define i32 @g() {			define i32 @g() {
	ret i32 0			ret i32 0
	}			}

llvm/test/LTO/Resolution/X86/multi-thinlto.ll

This file was added.

				; RUN: opt -module-summary %s -o %t.o
				; RUN: llvm-cat -b -o %t2.o %t.o %t.o
				; RUN: not llvm-lto2 -o %t3.o %t2.o 2>&1 | FileCheck %s
				; CHECK: Expected at most one ThinLTO module per bitcode file

				target triple = "x86_64-unknown-linux-gnu"
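
The two new negative tests (empty-bitcode.test and multi-thinlto.ll) check for specific diagnostics, while mixed_lto.ll shows that one regular and one ThinLTO module may share a file. A minimal sketch of validation logic consistent with those tests follows. `ModuleInfo` and `validateInputFile` are hypothetical names invented for this illustration, and where the real checks live in LTO.cpp is not shown here; only the two message strings come from the tests.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical description of one module inside a bitcode file.
struct ModuleInfo {
  bool HasSummary; // true for a ThinLTO module (one carrying a summary)
};

// Returns an empty string on success, otherwise the diagnostic text that the
// tests above expect.
inline std::string validateInputFile(const std::vector<ModuleInfo> &Mods) {
  if (Mods.empty())
    return "Bitcode file does not contain any modules";

  int ThinCount = 0;
  for (const ModuleInfo &M : Mods)
    if (M.HasSummary)
      ++ThinCount;

  if (ThinCount > 1)
    return "Expected at most one ThinLTO module per bitcode file";
  return "";
}
```

Under this sketch a file holding one regular and one ThinLTO module passes, matching the `llvm-cat -b -o %t4.o %t2.o %t1.o` case in mixed_lto.ll, while concatenating the same ThinLTO module twice is rejected.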


LTO: Add support for multi-module bitcode files.
Status: Closed, Public

Revision Contents

Diff 81322

clang/lib/CodeGen/BackendUtil.cpp

clang/test/CodeGen/thinlto_backend.ll

llvm/include/llvm/Bitcode/BitcodeReader.h

llvm/include/llvm/LTO/LTO.h

llvm/include/llvm/LTO/LTOBackend.h

llvm/lib/LTO/LTO.cpp

llvm/lib/LTO/LTOBackend.cpp

llvm/test/LTO/Resolution/X86/empty-bitcode.test

llvm/test/LTO/Resolution/X86/mixed_lto.ll

llvm/test/LTO/Resolution/X86/multi-thinlto.ll