This is an archive of the discontinued LLVM Phabricator instance.

clang/lib/CodeGen/BackendUtil.cpp
771	The error message could be a bit more specific here.
775	We load the first BitcodeModule, should we assert that this is the only one? How do we know this is the right one?
llvm/include/llvm/LTO/LTO.h
224	Since we have the two, it seems worth documenting these.
421	This is a pretty annoying API with ResI as an InOut parameter: at least document it.
llvm/lib/LTO/LTO.cpp
187	Document.
190	Add a comment to explain why it is out-of-line.
200	Some high level one or two sentences commenting what is going to happen below in the loop would be welcome I think.
212	Is there an expectation that no symbol can be duplicated in the various modules?
250	You can create an InputFile where we wouldn't find any module. If you don't want to support that, we should detect it in `create()` and return an error there. Otherwise it is UB to call this on a validly created `InputFile`.
276	Add a comment to explain why it is out-of-line.
456	We'll arrive here from `for (InputFile::InputModule &IM : Input->Mods)` in `LTO::add()`, how are we distinguishing `BM.getModuleIdentifier()` when there are multiple modules per file? I assume we don't expect multiple modules with a summary in the same input file, but in this case we should check here that the module has been inserted and error otherwise.
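
The in/out `ResI` parameter flagged in the LTO.h comment above can be illustrated with a minimal sketch. It is not the LLVM implementation: `Resolution`, `consumeModule` and `consumeFile` are hypothetical names, and only the cursor-passing pattern is taken from the signatures under review. Each call advances a shared cursor past the resolutions it consumes, so the next module picks up where the previous one stopped.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for lto::SymbolResolution; only one field is modelled.
struct Resolution {
  bool Prevailing = false;
};

// Consumes NumSyms resolutions for one module.
// ResI is an in/out cursor: on entry it points at this module's first
// resolution, on return it points one past the last one consumed.
// ResE marks the end of the whole array and is never advanced.
// Returns false if the array runs out before the module is fully resolved.
bool consumeModule(std::size_t NumSyms, const Resolution *&ResI,
                   const Resolution *ResE, std::size_t &PrevailingCount) {
  for (std::size_t I = 0; I != NumSyms; ++I) {
    if (ResI == ResE)
      return false;
    if (ResI->Prevailing)
      ++PrevailingCount;
    ++ResI;
  }
  return true;
}

// Walks every module of one input file with a single shared cursor and
// checks that the resolutions were consumed exactly.
bool consumeFile(const std::vector<std::size_t> &SymsPerModule,
                 const std::vector<Resolution> &Res,
                 std::size_t &PrevailingCount) {
  const Resolution *ResI = Res.data();
  const Resolution *ResE = Res.data() + Res.size();
  for (std::size_t N : SymsPerModule)
    if (!consumeModule(N, ResI, ResE, PrevailingCount))
      return false;
  return ResI == ResE;
}
```

Documenting the cursor contract at the declaration, as the review asks, matters because the caller cannot tell from the call site that `ResI` is modified.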

pcc marked 10 inline comments as done. Dec 13 2016, 4:03 PM

pcc added inline comments.

clang/lib/CodeGen/BackendUtil.cpp
775	I added some code that makes sure that we load the right module.
llvm/lib/LTO/LTO.cpp
212	If you mean that the modules may not both contain a definition of the symbol, then yes. The entity creating these bitcode files is expected to adhere to this policy. (The same applies to definitions of comdats, so the code on lines 215-221 below remains the same.) It's possible to have a defined symbol in one module and a duplicate undefined symbol with the same name in another module; the situation is similar to module inline asm right now (i.e. PR30396). Whatever solution we come up with for that problem should also apply to this one.
250	Added a check to `create()`.
456	`BM.getModuleIdentifier()` will be the same for each `BitcodeModule` in the same input file by construction, so we can just test the return value of `insert` here. (I feel obliged to point out that this is yet another scenario where (path, byte slice) pairs would work better as a representation of ThinLTO modules.)
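
The check added to `create()` is not shown in this part of the thread, so the following is only a sketch of the general idea under stated assumptions: `InputFileSketch`, its string-based module list, and the error message are all hypothetical. The factory rejects an empty module list so that accessors which read the first module never index an empty vector.

```cpp
#include <cassert>
#include <memory>
#include <string>
#include <utility>
#include <vector>

// Hypothetical, simplified model of an input file holding parsed modules.
class InputFileSketch {
  std::vector<std::string> ModuleIds;
  InputFileSketch() = default;

public:
  // Factory: rejects an empty module list so accessors may read the first
  // module. On failure returns nullptr and stores a message in Err.
  static std::unique_ptr<InputFileSketch>
  create(std::vector<std::string> Ids, std::string &Err) {
    if (Ids.empty()) {
      Err = "input does not contain any modules"; // hypothetical message
      return nullptr;
    }
    std::unique_ptr<InputFileSketch> File(new InputFileSketch);
    File->ModuleIds = std::move(Ids);
    return File;
  }

  // Safe because create() guarantees at least one module.
  const std::string &getName() const { return ModuleIds.front(); }
};
```

The real code returns `Expected<std::unique_ptr<InputFile>>`; the out-parameter error string here is only a dependency-free substitute.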

Address review comments

mehdi_amini added inline comments. Dec 13 2016, 4:49 PM

clang/lib/CodeGen/BackendUtil.cpp
783	What if `I.first` was already present in the map? This seems to be the same case as below for ThinLTO, where we should check the returned value of `insert().second`.
llvm/lib/LTO/LTO.cpp
456	What is the other scenario? I don't remember what was specific to ThinLTO for this? Also a pair `path/byte slice` doesn't fit the API in general, does it? We don't necessarily have a "file". That said, instead of path the identifier could be a `<BufferID, offset>`, with `BufferID` intended to be a unique identifier provided by the client of the API per-buffer. I'm not sure how deep we'd have to thread this though. I think that as long as when we create a llvm::Module the identifier is consistent with whatever is used here, everything should be OK.

LGTM otherwise.
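
The `insert().second` check discussed for both module maps can be sketched as follows. This is an illustration only: it uses `std::map` instead of `llvm::MapVector`, and `ModuleMapSketch` and `addModule` are hypothetical names. The point is that `insert` reports whether the key was new, so a duplicate module identifier can be turned into an error instead of being silently ignored.

```cpp
#include <cassert>
#include <map>
#include <string>

// Hypothetical stand-in for the ThinLTO module map (identifier -> module).
using ModuleMapSketch = std::map<std::string, int>;

// Registers a module under Id. Returns false, leaving the map unchanged,
// when a module with the same identifier is already present.
bool addModule(ModuleMapSketch &Map, const std::string &Id, int Module) {
  return Map.insert({Id, Module}).second;
}
```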

This revision is now accepted and ready to land. Dec 13 2016, 4:49 PM

pcc added inline comments. Dec 13 2016, 5:16 PM

clang/lib/CodeGen/BackendUtil.cpp
783	That doesn't seem like it can happen. The range for loop starting on line 758 enumerates elements of a map, so each `I.first()` will be unique. We also break out of this loop as soon as we see a module with a summary.
llvm/lib/LTO/LTO.cpp
456	> What is the other scenario? I don't remember what was specific to ThinLTO for this?
	The main other scenario is (non-thin) archive files, which is already a problem as we don't handle that correctly in the distributed case, and we have workarounds in both the gold plugin and lld to give `-save-temps` temporaries appropriate names. This is ThinLTO-specific because we don't create temporaries for each input file under regular LTO.
	> Also a pair path/byte slice doesn't fit the API in general, does it? We don't necessarily have a "file".
	True, but it's also the case that we don't necessarily have a "file" now and we're already passing paths around. I imagine that clients which don't use files could just make something up for the offsets as they would make something up for file paths.
	> That said, instead of path the identifier could be a <BufferID, offset>, with BufferID intended to be a unique identifier provided by the client of the API per-buffer.
	I think that could work as well.
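
A `<BufferID, offset>` identifier of the kind proposed in this exchange might look like the sketch below. Nothing here exists in LLVM: `ModuleKey` and its fields are hypothetical, and the buffer ID is assumed to be whatever unique string the client supplies. Two modules in the same buffer then remain distinguishable by offset.

```cpp
#include <cassert>
#include <cstdint>
#include <map>
#include <string>
#include <tuple>

// Hypothetical module key: a client-chosen buffer ID plus the byte offset of
// the module within that buffer.
struct ModuleKey {
  std::string BufferID;
  std::uint64_t Offset;

  // Lexicographic order on (BufferID, Offset) so the key works in std::map.
  bool operator<(const ModuleKey &RHS) const {
    return std::tie(BufferID, Offset) < std::tie(RHS.BufferID, RHS.Offset);
  }
};
```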

Closed by commit rL289621: LTO: Add support for multi-module bitcode files. (authored by pcc). Dec 13 2016, 5:28 PM

This revision was automatically updated to reflect the committed changes.

mehdi_amini added inline comments. Dec 13 2016, 5:37 PM

llvm/lib/LTO/LTO.cpp
456	In the C API (and thus in ld64), the expectation is that every input buffer has to be provided with a unique id. It happens that this ID is usually the path on file (with `libFoo.a(member.o)` when a static archive is involved). Because the linker is supplying one member at a time, the unique ID in the API is enough. Having multiple modules per file is new though and would break this "unique ID" provided by the linker, we could suffix it when loading each individual bitcode/module though. The --save-temps is always using integer right now I believe (files are 1.thin.o, 2.thin.o, etc.).
	> True, but it's also the case that we don't necessarily have a "file" now and we're already passing paths around.
	That's just us being inconsistent, I tried to express "ModuleID" instead of path as much as possible.

pcc added a subscriber: davide. Dec 13 2016, 5:51 PM

pcc added inline comments.

llvm/lib/LTO/LTO.cpp
456	> Because the linker is supplying one member at a time, the unique ID in the API is enough.
	I think @davide had a case where an archive had two members with the same name, and they were only distinguishable by offset.
	> The --save-temps is always using integer right now I believe (files are 1.thin.o, 2.thin.o, etc.).
	Not always: http://llvm-cs.pcc.me.uk/lib/LTO/LTOBackend.cpp#77 gold and lld both pass `UseInputModulePath == true` here. We probably want to overhaul how the filenames look for `save-temps`, but I think in the end they should contain the module ID in some form.

Yes, thin-archivecollision.ll in lld/test/lto is an example, but it could be even worse because the two members can be part of the same archive. I find it weird that it's possible to create an archive with two distinct members that have the same name, but apparently ar considers this a legitimate operation.

mehdi_amini added inline comments. Dec 13 2016, 6:00 PM

llvm/lib/LTO/LTO.cpp
456	> I think @davide had a case where an archive had two members with the same name, and they were only distinguishable by offset
	I remember this. I wouldn't be too concerned with it here, and would push to the client the responsibility to provide unique IDs. (Nothing prevents the linker from building an ID that always includes the offset, for instance.)
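
As a rough illustration of a client building IDs that always include the offset, consider the sketch below. The `archive(member at offset)` layout and the `makeMemberId` helper are assumptions made for this example, not a format required by the LTO API. Two archive members with the same name still receive distinct IDs because their offsets differ.

```cpp
#include <cassert>
#include <cstdint>
#include <string>

// Builds an ID for an archive member. The "archive(member at offset)" layout
// is an assumption made for this sketch.
std::string makeMemberId(const std::string &Archive, const std::string &Member,
                         std::uint64_t Offset) {
  return Archive + "(" + Member + " at " + std::to_string(Offset) + ")";
}
```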

pcc added inline comments. Dec 13 2016, 6:16 PM

llvm/lib/LTO/LTO.cpp
456	Well, in the distributed case some component needs to understand the module IDs in order to distribute the work properly. Part of that may involve translating the module IDs into "file paths" (which may be actual file paths or conceptual ones). If we can arrange to use the same ("file path", offset) scheme in all linkers, that component can be shared between linkers. But to a certain extent this is all hypothetical, I think I'd want to prototype before being sure that this is the right design.

Revision Contents

Path	Size
clang/lib/CodeGen/BackendUtil.cpp	13 lines
llvm/include/llvm/Bitcode/BitcodeReader.h	2 lines
llvm/include/llvm/LTO/LTO.h	28 lines
llvm/include/llvm/LTO/LTOBackend.h	3 lines
llvm/lib/LTO/LTO.cpp	158 lines
llvm/lib/LTO/LTOBackend.cpp	8 lines
llvm/test/LTO/Resolution/X86/mixed_lto.ll	6 lines

Diff 79959

clang/lib/CodeGen/BackendUtil.cpp

[747 lines not shown]	static void runThinLTOBackend(const CodeGenOptions &CGOpts, Module *M,

// FIXME: We could simply import the modules mentioned in the combined index		// FIXME: We could simply import the modules mentioned in the combined index
// here.		// here.
FunctionImporter::ImportMapTy ImportList;		FunctionImporter::ImportMapTy ImportList;
ComputeCrossModuleImportForModule(M->getModuleIdentifier(), *CombinedIndex,		ComputeCrossModuleImportForModule(M->getModuleIdentifier(), *CombinedIndex,
ImportList);		ImportList);

std::vector<std::unique_ptr<llvm::MemoryBuffer>> OwnedImports;		std::vector<std::unique_ptr<llvm::MemoryBuffer>> OwnedImports;
MapVector<llvm::StringRef, llvm::MemoryBufferRef> ModuleMap;		MapVector<llvm::StringRef, llvm::BitcodeModule> ModuleMap;

for (auto &I : ImportList) {		for (auto &I : ImportList) {
ErrorOr<std::unique_ptr<llvm::MemoryBuffer>> MBOrErr =		ErrorOr<std::unique_ptr<llvm::MemoryBuffer>> MBOrErr =
llvm::MemoryBuffer::getFile(I.first());		llvm::MemoryBuffer::getFile(I.first());
if (!MBOrErr) {		if (!MBOrErr) {
errs() << "Error loading imported file '" << I.first()		errs() << "Error loading imported file '" << I.first()
<< "': " << MBOrErr.getError().message() << "\n";		<< "': " << MBOrErr.getError().message() << "\n";
return;		return;
}		}
ModuleMap[I.first()] = (*MBOrErr)->getMemBufferRef();
		Expected<std::vector<BitcodeModule>> BMsOrErr =
		getBitcodeModuleList(**MBOrErr);
		if (!BMsOrErr) {
		handleAllErrors(BMsOrErr.takeError(), [&](ErrorInfoBase &EIB) {
		errs() << "Error running ThinLTO backend: " << EIB.message() << '\n';
		});
		return;
		}
		ModuleMap.insert({I.first(), (*BMsOrErr)[0]});
OwnedImports.push_back(std::move(*MBOrErr));		OwnedImports.push_back(std::move(*MBOrErr));
}		}
auto AddStream = [&](size_t Task) {		auto AddStream = [&](size_t Task) {
return llvm::make_unique<lto::NativeObjectStream>(std::move(OS));		return llvm::make_unique<lto::NativeObjectStream>(std::move(OS));
};		};
lto::Config Conf;		lto::Config Conf;
if (Error E = thinBackend(		if (Error E = thinBackend(
Conf, 0, AddStream, M, CombinedIndex, ImportList,		Conf, 0, AddStream, M, CombinedIndex, ImportList,
ModuleToDefinedGVSummaries[M->getModuleIdentifier()], ModuleMap)) {		ModuleToDefinedGVSummaries[M->getModuleIdentifier()], ModuleMap)) {
handleAllErrors(std::move(E), [&](ErrorInfoBase &EIB) {		handleAllErrors(std::move(E), [&](ErrorInfoBase &EIB) {
errs() << "Error running ThinLTO backend: " << EIB.message() << '\n';		errs() << "Error running ThinLTO backend: " << EIB.message() << '\n';
});		});
}		}
}		}

void clang::EmitBackendOutput(DiagnosticsEngine &Diags,		void clang::EmitBackendOutput(DiagnosticsEngine &Diags,
[145 lines not shown]

llvm/include/llvm/Bitcode/BitcodeReader.h

[64 lines not shown]	class BitcodeModule {
getModuleImpl(LLVMContext &Context, bool MaterializeAll,		getModuleImpl(LLVMContext &Context, bool MaterializeAll,
bool ShouldLazyLoadMetadata);		bool ShouldLazyLoadMetadata);

public:		public:
StringRef getBuffer() const {		StringRef getBuffer() const {
return StringRef((const char *)Buffer.begin(), Buffer.size());		return StringRef((const char *)Buffer.begin(), Buffer.size());
}		}

		StringRef getModuleIdentifier() const { return ModuleIdentifier; }

/// Read the bitcode module and prepare for lazy deserialization of function		/// Read the bitcode module and prepare for lazy deserialization of function
/// bodies. If ShouldLazyLoadMetadata is true, lazily load metadata as well.		/// bodies. If ShouldLazyLoadMetadata is true, lazily load metadata as well.
Expected<std::unique_ptr<Module>>		Expected<std::unique_ptr<Module>>
getLazyModule(LLVMContext &Context, bool ShouldLazyLoadMetadata);		getLazyModule(LLVMContext &Context, bool ShouldLazyLoadMetadata);

/// Read the entire bitcode module and return it.		/// Read the entire bitcode module and return it.
Expected<std::unique_ptr<Module>> parseModule(LLVMContext &Context);		Expected<std::unique_ptr<Module>> parseModule(LLVMContext &Context);

[136 lines not shown]

llvm/include/llvm/LTO/LTO.h

[25 lines not shown]
#include "llvm/Linker/IRMover.h"		#include "llvm/Linker/IRMover.h"
#include "llvm/Object/IRObjectFile.h"		#include "llvm/Object/IRObjectFile.h"
#include "llvm/Support/thread.h"		#include "llvm/Support/thread.h"
#include "llvm/Target/TargetOptions.h"		#include "llvm/Target/TargetOptions.h"
#include "llvm/Transforms/IPO/FunctionImport.h"		#include "llvm/Transforms/IPO/FunctionImport.h"

namespace llvm {		namespace llvm {

		class BitcodeModule;
class Error;		class Error;
class LLVMContext;		class LLVMContext;
class MemoryBufferRef;		class MemoryBufferRef;
class Module;		class Module;
class Target;		class Target;
class raw_pwrite_stream;		class raw_pwrite_stream;

/// Resolve Weak and LinkOnce values in the \p Index. Linkage changes recorded		/// Resolve Weak and LinkOnce values in the \p Index. Linkage changes recorded
[33 lines not shown]
/// information that an LTO client should need in order to do symbol resolution.		/// information that an LTO client should need in order to do symbol resolution.
class InputFile {		class InputFile {
// FIXME: Remove LTO class friendship once we have bitcode symbol tables.		// FIXME: Remove LTO class friendship once we have bitcode symbol tables.
friend LTO;		friend LTO;
InputFile() = default;		InputFile() = default;

// FIXME: Remove the LLVMContext once we have bitcode symbol tables.		// FIXME: Remove the LLVMContext once we have bitcode symbol tables.
LLVMContext Ctx;		LLVMContext Ctx;
		struct InputModule;
		std::vector<InputModule> Mods;
ModuleSymbolTable SymTab;		ModuleSymbolTable SymTab;
std::unique_ptr<Module> Mod;
MemoryBufferRef MBRef;

std::vector<StringRef> Comdats;		std::vector<StringRef> Comdats;
DenseMap<const Comdat *, unsigned> ComdatMap;		DenseMap<const Comdat *, unsigned> ComdatMap;

public:		public:
		~InputFile();

/// Create an InputFile.		/// Create an InputFile.
static Expected<std::unique_ptr<InputFile>> create(MemoryBufferRef Object);		static Expected<std::unique_ptr<InputFile>> create(MemoryBufferRef Object);

class symbol_iterator;		class symbol_iterator;

/// This is a wrapper for ArrayRef<ModuleSymbolTable::Symbol>::iterator that		/// This is a wrapper for ArrayRef<ModuleSymbolTable::Symbol>::iterator that
/// exposes only the information that an LTO client should need in order to do		/// exposes only the information that an LTO client should need in order to do
/// symbol resolution.		/// symbol resolution.
[113 lines not shown]	public:

/// A range over the symbols in this InputFile.		/// A range over the symbols in this InputFile.
iterator_range<symbol_iterator> symbols() {		iterator_range<symbol_iterator> symbols() {
return llvm::make_range(		return llvm::make_range(
symbol_iterator(SymTab.symbols().begin(), SymTab, this),		symbol_iterator(SymTab.symbols().begin(), SymTab, this),
symbol_iterator(SymTab.symbols().end(), SymTab, this));		symbol_iterator(SymTab.symbols().end(), SymTab, this));
}		}

StringRef getSourceFileName() const { return Mod->getSourceFileName(); }		StringRef getName() const;
MemoryBufferRef getMemoryBufferRef() const { return MBRef; }		StringRef getSourceFileName() const;

// Returns a table with all the comdats used by this file.		// Returns a table with all the comdats used by this file.
ArrayRef<StringRef> getComdatTable() const { return Comdats; }		ArrayRef<StringRef> getComdatTable() const { return Comdats; }

		private:
		iterator_range<symbol_iterator> module_symbols(InputModule &IM);
};		};

/// This class wraps an output stream for a native object. Most clients should		/// This class wraps an output stream for a native object. Most clients should
/// just be able to return an instance of this base class from the stream		/// just be able to return an instance of this base class from the stream
/// callback, but if a client needs to perform some action after the stream is		/// callback, but if a client needs to perform some action after the stream is
/// written to, that can be done by deriving from this class and overriding the		/// written to, that can be done by deriving from this class and overriding the
/// destructor.		/// destructor.
class NativeObjectStream {		class NativeObjectStream {
[73 lines not shown]
public:		public:
/// Create an LTO object. A default constructed LTO object has a reasonable		/// Create an LTO object. A default constructed LTO object has a reasonable
/// production configuration, but you can customize it by passing arguments to		/// production configuration, but you can customize it by passing arguments to
/// this constructor.		/// this constructor.
/// FIXME: We do currently require the DiagHandler field to be set in Conf.		/// FIXME: We do currently require the DiagHandler field to be set in Conf.
/// Until that is fixed, a Config argument is required.		/// Until that is fixed, a Config argument is required.
LTO(Config Conf, ThinBackend Backend = nullptr,		LTO(Config Conf, ThinBackend Backend = nullptr,
unsigned ParallelCodeGenParallelismLevel = 1);		unsigned ParallelCodeGenParallelismLevel = 1);
		~LTO();

/// Add an input file to the LTO link, using the provided symbol resolutions.		/// Add an input file to the LTO link, using the provided symbol resolutions.
/// The symbol resolutions must appear in the enumeration order given by		/// The symbol resolutions must appear in the enumeration order given by
/// InputFile::symbols().		/// InputFile::symbols().
Error add(std::unique_ptr<InputFile> Obj, ArrayRef<SymbolResolution> Res);		Error add(std::unique_ptr<InputFile> Obj, ArrayRef<SymbolResolution> Res);

/// Returns an upper bound on the number of tasks that the client may expect.		/// Returns an upper bound on the number of tasks that the client may expect.
/// This may only be called after all IR object files have been added. For a		/// This may only be called after all IR object files have been added. For a
[30 lines not shown]	struct RegularLTOState {
std::unique_ptr<IRMover> Mover;		std::unique_ptr<IRMover> Mover;
} RegularLTO;		} RegularLTO;

struct ThinLTOState {		struct ThinLTOState {
ThinLTOState(ThinBackend Backend);		ThinLTOState(ThinBackend Backend);

ThinBackend Backend;		ThinBackend Backend;
ModuleSummaryIndex CombinedIndex;		ModuleSummaryIndex CombinedIndex;
MapVector<StringRef, MemoryBufferRef> ModuleMap;		MapVector<StringRef, BitcodeModule> ModuleMap;
DenseMap<GlobalValue::GUID, StringRef> PrevailingModuleForGUID;		DenseMap<GlobalValue::GUID, StringRef> PrevailingModuleForGUID;
} ThinLTO;		} ThinLTO;

// The global resolution for a particular (mangled) symbol name. This is in		// The global resolution for a particular (mangled) symbol name. This is in
// particular necessary to track whether each symbol can be internalized.		// particular necessary to track whether each symbol can be internalized.
// Because any input file may introduce a new cross-partition reference, we		// Because any input file may introduce a new cross-partition reference, we
// cannot make any final internalization decisions until all input files have		// cannot make any final internalization decisions until all input files have
// been added and the client has called run(). During run() we apply		// been added and the client has called run(). During run() we apply
[31 lines not shown]	private:

// Global mapping from mangled symbol names to resolutions.		// Global mapping from mangled symbol names to resolutions.
StringMap<GlobalResolution> GlobalResolutions;		StringMap<GlobalResolution> GlobalResolutions;

void addSymbolToGlobalRes(SmallPtrSet<GlobalValue *, 8> &Used,		void addSymbolToGlobalRes(SmallPtrSet<GlobalValue *, 8> &Used,
const InputFile::Symbol &Sym, SymbolResolution Res,		const InputFile::Symbol &Sym, SymbolResolution Res,
unsigned Partition);		unsigned Partition);

Error addRegularLTO(std::unique_ptr<InputFile> Input,		Error addModule(InputFile &Input, InputFile::InputModule &IM,
ArrayRef<SymbolResolution> Res);		const SymbolResolution *&ResI, const SymbolResolution *ResE);
Error addThinLTO(std::unique_ptr<InputFile> Input,		Error addRegularLTO(BitcodeModule BM, const SymbolResolution *&ResI,
ArrayRef<SymbolResolution> Res);		const SymbolResolution *ResE);
		Error addThinLTO(BitcodeModule BM, Module &M,
		iterator_range<InputFile::symbol_iterator> Syms,
		const SymbolResolution *&ResI, const SymbolResolution *ResE);

Error runRegularLTO(AddStreamFn AddStream);		Error runRegularLTO(AddStreamFn AddStream);
Error runThinLTO(AddStreamFn AddStream, NativeObjectCache Cache,		Error runThinLTO(AddStreamFn AddStream, NativeObjectCache Cache,
bool HasRegularLTO);		bool HasRegularLTO);

mutable bool CalledGetMaxTasks = false;		mutable bool CalledGetMaxTasks = false;
};		};

[21 lines not shown]

llvm/include/llvm/LTO/LTOBackend.h

	[21 lines not shown]
	#include "llvm/IR/ModuleSummaryIndex.h"			#include "llvm/IR/ModuleSummaryIndex.h"
	#include "llvm/LTO/LTO.h"			#include "llvm/LTO/LTO.h"
	#include "llvm/Support/MemoryBuffer.h"			#include "llvm/Support/MemoryBuffer.h"
	#include "llvm/Target/TargetOptions.h"			#include "llvm/Target/TargetOptions.h"
	#include "llvm/Transforms/IPO/FunctionImport.h"			#include "llvm/Transforms/IPO/FunctionImport.h"

	namespace llvm {			namespace llvm {

				class BitcodeModule;
	class Error;			class Error;
	class Module;			class Module;
	class Target;			class Target;

	namespace lto {			namespace lto {

	/// Runs a regular LTO backend.			/// Runs a regular LTO backend.
	Error backend(Config &C, AddStreamFn AddStream,			Error backend(Config &C, AddStreamFn AddStream,
	unsigned ParallelCodeGenParallelismLevel,			unsigned ParallelCodeGenParallelismLevel,
	std::unique_ptr<Module> M);			std::unique_ptr<Module> M);

	/// Runs a ThinLTO backend.			/// Runs a ThinLTO backend.
	Error thinBackend(Config &C, unsigned Task, AddStreamFn AddStream, Module &M,			Error thinBackend(Config &C, unsigned Task, AddStreamFn AddStream, Module &M,
	ModuleSummaryIndex &CombinedIndex,			ModuleSummaryIndex &CombinedIndex,
	const FunctionImporter::ImportMapTy &ImportList,			const FunctionImporter::ImportMapTy &ImportList,
	const GVSummaryMapTy &DefinedGlobals,			const GVSummaryMapTy &DefinedGlobals,
	MapVector<StringRef, MemoryBufferRef> &ModuleMap);			MapVector<StringRef, BitcodeModule> &ModuleMap);
	}			}
	}			}

	#endif			#endif

llvm/lib/LTO/LTO.cpp

[175 lines not shown]
// as external and non-exported values as internal.		// as external and non-exported values as internal.
void llvm::thinLTOInternalizeAndPromoteInIndex(		void llvm::thinLTOInternalizeAndPromoteInIndex(
ModuleSummaryIndex &Index,		ModuleSummaryIndex &Index,
function_ref<bool(StringRef, GlobalValue::GUID)> isExported) {		function_ref<bool(StringRef, GlobalValue::GUID)> isExported) {
for (auto &I : Index)		for (auto &I : Index)
thinLTOInternalizeAndPromoteGUID(I.second, I.first, isExported);		thinLTOInternalizeAndPromoteGUID(I.second, I.first, isExported);
}		}

		struct InputFile::InputModule {
		BitcodeModule BM;
		std::unique_ptr<Module> Mod;
		size_t SymBegin, SymEnd;
		};

		InputFile::~InputFile() = default;

Expected<std::unique_ptr<InputFile>> InputFile::create(MemoryBufferRef Object) {		Expected<std::unique_ptr<InputFile>> InputFile::create(MemoryBufferRef Object) {
std::unique_ptr<InputFile> File(new InputFile);		std::unique_ptr<InputFile> File(new InputFile);

ErrorOr<MemoryBufferRef> BCOrErr =		ErrorOr<MemoryBufferRef> BCOrErr =
IRObjectFile::findBitcodeInMemBuffer(Object);		IRObjectFile::findBitcodeInMemBuffer(Object);
if (!BCOrErr)		if (!BCOrErr)
return errorCodeToError(BCOrErr.getError());		return errorCodeToError(BCOrErr.getError());
File->MBRef = *BCOrErr;

		Expected<std::vector<BitcodeModule>> BMsOrErr =
		getBitcodeModuleList(*BCOrErr);
		if (!BMsOrErr)
		return BMsOrErr.takeError();

		for (auto BM : *BMsOrErr) {
Expected<std::unique_ptr<Module>> MOrErr =		Expected<std::unique_ptr<Module>> MOrErr =
getLazyBitcodeModule(*BCOrErr, File->Ctx,		BM.getLazyModule(File->Ctx, /*ShouldLazyLoadMetadata*/ true);
/*ShouldLazyLoadMetadata*/ true);
if (!MOrErr)		if (!MOrErr)
return MOrErr.takeError();		return MOrErr.takeError();

File->Mod = std::move(*MOrErr);		size_t SymBegin = File->SymTab.symbols().size();
File->SymTab.addModule(File->Mod.get());		File->SymTab.addModule(MOrErr->get());
		size_t SymEnd = File->SymTab.symbols().size();
for (const auto &C : File->Mod->getComdatSymbolTable()) {
auto P =		for (const auto &C : (*MOrErr)->getComdatSymbolTable()) {
File->ComdatMap.insert(std::make_pair(&C.second, File->Comdats.size()));		auto P = File->ComdatMap.insert(
		std::make_pair(&C.second, File->Comdats.size()));
assert(P.second);		assert(P.second);
(void)P;		(void)P;
File->Comdats.push_back(C.first());		File->Comdats.push_back(C.first());
}		}

		File->Mods.push_back({BM, std::move(*MOrErr), SymBegin, SymEnd});
		}

return std::move(File);		return std::move(File);
}		}

Expected<int> InputFile::Symbol::getComdatIndex() const {		Expected<int> InputFile::Symbol::getComdatIndex() const {
if (!isGV())		if (!isGV())
return -1;		return -1;
const GlobalObject *GO = getGV()->getBaseObject();		const GlobalObject *GO = getGV()->getBaseObject();
if (!GO)		if (!GO)
return make_error<StringError>("Unable to determine comdat of alias!",		return make_error<StringError>("Unable to determine comdat of alias!",
inconvertibleErrorCode());		inconvertibleErrorCode());
if (const Comdat *C = GO->getComdat()) {		if (const Comdat *C = GO->getComdat()) {
auto I = File->ComdatMap.find(C);		auto I = File->ComdatMap.find(C);
assert(I != File->ComdatMap.end());		assert(I != File->ComdatMap.end());
return I->second;		return I->second;
}		}
return -1;		return -1;
}		}

		StringRef InputFile::getName() const {
		return Mods[0].BM.getModuleIdentifier();
		}

		StringRef InputFile::getSourceFileName() const {
		return Mods[0].Mod->getSourceFileName();
		}

		iterator_range<InputFile::symbol_iterator>
		InputFile::module_symbols(InputModule &IM) {
		return llvm::make_range(
		symbol_iterator(SymTab.symbols().data() + IM.SymBegin, SymTab, this),
		symbol_iterator(SymTab.symbols().data() + IM.SymEnd, SymTab, this));
		}

LTO::RegularLTOState::RegularLTOState(unsigned ParallelCodeGenParallelismLevel,		LTO::RegularLTOState::RegularLTOState(unsigned ParallelCodeGenParallelismLevel,
Config &Conf)		Config &Conf)
: ParallelCodeGenParallelismLevel(ParallelCodeGenParallelismLevel),		: ParallelCodeGenParallelismLevel(ParallelCodeGenParallelismLevel),
Ctx(Conf) {}		Ctx(Conf) {}

LTO::ThinLTOState::ThinLTOState(ThinBackend Backend) : Backend(Backend) {		LTO::ThinLTOState::ThinLTOState(ThinBackend Backend) : Backend(Backend) {
if (!Backend)		if (!Backend)
this->Backend =		this->Backend =
createInProcessThinBackend(llvm::heavyweight_hardware_concurrency());		createInProcessThinBackend(llvm::heavyweight_hardware_concurrency());
}		}

LTO::LTO(Config Conf, ThinBackend Backend,		LTO::LTO(Config Conf, ThinBackend Backend,
unsigned ParallelCodeGenParallelismLevel)		unsigned ParallelCodeGenParallelismLevel)
: Conf(std::move(Conf)),		: Conf(std::move(Conf)),
RegularLTO(ParallelCodeGenParallelismLevel, this->Conf),		RegularLTO(ParallelCodeGenParallelismLevel, this->Conf),
ThinLTO(std::move(Backend)) {}		ThinLTO(std::move(Backend)) {}

		LTO::~LTO() = default;

// Add the given symbol to the GlobalResolutions map, and resolve its partition.		// Add the given symbol to the GlobalResolutions map, and resolve its partition.
void LTO::addSymbolToGlobalRes(SmallPtrSet<GlobalValue *, 8> &Used,		void LTO::addSymbolToGlobalRes(SmallPtrSet<GlobalValue *, 8> &Used,
const InputFile::Symbol &Sym,		const InputFile::Symbol &Sym,
SymbolResolution Res, unsigned Partition) {		SymbolResolution Res, unsigned Partition) {
GlobalValue *GV = Sym.isGV() ? Sym.getGV() : nullptr;		GlobalValue *GV = Sym.isGV() ? Sym.getGV() : nullptr;

auto &GlobalRes = GlobalResolutions[Sym.getName()];		auto &GlobalRes = GlobalResolutions[Sym.getName()];
if (GV) {		if (GV) {
GlobalRes.UnnamedAddr &= GV->hasGlobalUnnamedAddr();		GlobalRes.UnnamedAddr &= GV->hasGlobalUnnamedAddr();
if (Res.Prevailing)		if (Res.Prevailing)
GlobalRes.IRName = GV->getName();		GlobalRes.IRName = GV->getName();
}		}
if (Res.VisibleToRegularObj || (GV && Used.count(GV)) ||		if (Res.VisibleToRegularObj || (GV && Used.count(GV)) ||
(GlobalRes.Partition != GlobalResolution::Unknown &&		(GlobalRes.Partition != GlobalResolution::Unknown &&
GlobalRes.Partition != Partition))		GlobalRes.Partition != Partition))
GlobalRes.Partition = GlobalResolution::External;		GlobalRes.Partition = GlobalResolution::External;
else		else
GlobalRes.Partition = Partition;		GlobalRes.Partition = Partition;
}		}

static void writeToResolutionFile(raw_ostream &OS, InputFile *Input,		static void writeToResolutionFile(raw_ostream &OS, InputFile *Input,
ArrayRef<SymbolResolution> Res) {		ArrayRef<SymbolResolution> Res) {
StringRef Path = Input->getMemoryBufferRef().getBufferIdentifier();		StringRef Path = Input->getName();
OS << Path << '\n';		OS << Path << '\n';
auto ResI = Res.begin();		auto ResI = Res.begin();
for (const InputFile::Symbol &Sym : Input->symbols()) {		for (const InputFile::Symbol &Sym : Input->symbols()) {
assert(ResI != Res.end());		assert(ResI != Res.end());
SymbolResolution Res = *ResI++;		SymbolResolution Res = *ResI++;

OS << "-r=" << Path << ',' << Sym.getName() << ',';		OS << "-r=" << Path << ',' << Sym.getName() << ',';
if (Res.Prevailing)		if (Res.Prevailing)
Show All 9 Lines

Error LTO::add(std::unique_ptr<InputFile> Input,		Error LTO::add(std::unique_ptr<InputFile> Input,
ArrayRef<SymbolResolution> Res) {		ArrayRef<SymbolResolution> Res) {
assert(!CalledGetMaxTasks);		assert(!CalledGetMaxTasks);

if (Conf.ResolutionFile)		if (Conf.ResolutionFile)
writeToResolutionFile(*Conf.ResolutionFile, Input.get(), Res);		writeToResolutionFile(*Conf.ResolutionFile, Input.get(), Res);

		const SymbolResolution *ResI = Res.begin();
		for (InputFile::InputModule &IM : Input->Mods)
		if (Error Err = addModule(*Input, IM, ResI, Res.end()))
		return Err;

		assert(ResI == Res.end());
		return Error::success();
		}
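
The `ResI` cursor that `LTO::add` threads through `addModule` is an in/out parameter: each module consumes one resolution per symbol and leaves the cursor where the next module should start. A minimal standalone sketch of that pattern follows. It does not use the LLVM API; `consumeModule`, `consumeFile`, and the use of plain `int` values as resolutions are hypothetical stand-ins chosen for illustration only.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for addModule: a "module" is just a symbol count and
// a "resolution" is an int. Consumes one resolution per symbol and advances
// the shared cursor ResI. Returns false if the resolutions run out early.
bool consumeModule(std::size_t NumSymbols, const int *&ResI, const int *ResE,
                   std::vector<int> &Seen) {
  assert(ResI <= ResE && "cursor must not pass the end");
  for (std::size_t I = 0; I != NumSymbols; ++I) {
    if (ResI == ResE)
      return false;
    Seen.push_back(*ResI++);
  }
  return true;
}

// Hypothetical stand-in for LTO::add: walks every module with one cursor and
// succeeds only if the resolutions were consumed exactly.
bool consumeFile(const std::vector<std::size_t> &ModuleSymbolCounts,
                 const std::vector<int> &Res, std::vector<int> &Seen) {
  const int *ResI = Res.data();
  const int *ResE = Res.data() + Res.size();
  for (std::size_t N : ModuleSymbolCounts)
    if (!consumeModule(N, ResI, ResE, Seen))
      return false;
  return ResI == ResE;
}
```

In this sketch a file with modules of 2 and 1 symbols accepts exactly three resolutions; too few or too many makes `consumeFile` return false, which corresponds to the two assertions on `ResI` in the patch.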

		Error LTO::addModule(InputFile &Input, InputFile::InputModule &IM,
		const SymbolResolution *&ResI,
		const SymbolResolution *ResE) {
// FIXME: move to backend		// FIXME: move to backend
Module &M = *Input->Mod;		Module &M = *IM.Mod;
if (!Conf.OverrideTriple.empty())		if (!Conf.OverrideTriple.empty())
M.setTargetTriple(Conf.OverrideTriple);		M.setTargetTriple(Conf.OverrideTriple);
else if (M.getTargetTriple().empty())		else if (M.getTargetTriple().empty())
M.setTargetTriple(Conf.DefaultTriple);		M.setTargetTriple(Conf.DefaultTriple);

Expected<bool> HasThinLTOSummary = hasGlobalValueSummary(Input->MBRef);		Expected<bool> HasThinLTOSummary = IM.BM.hasSummary();
if (!HasThinLTOSummary)		if (!HasThinLTOSummary)
return HasThinLTOSummary.takeError();		return HasThinLTOSummary.takeError();

if (*HasThinLTOSummary)		if (*HasThinLTOSummary)
return addThinLTO(std::move(Input), Res);		return addThinLTO(IM.BM, M, Input.module_symbols(IM), ResI, ResE);
else		else
return addRegularLTO(std::move(Input), Res);		return addRegularLTO(IM.BM, ResI, ResE);
}		}

// Add a regular LTO object to the link.		// Add a regular LTO object to the link.
Error LTO::addRegularLTO(std::unique_ptr<InputFile> Input,		Error LTO::addRegularLTO(BitcodeModule BM, const SymbolResolution *&ResI,
ArrayRef<SymbolResolution> Res) {		const SymbolResolution *ResE) {
if (!RegularLTO.CombinedModule) {		if (!RegularLTO.CombinedModule) {
RegularLTO.CombinedModule =		RegularLTO.CombinedModule =
llvm::make_unique<Module>("ld-temp.o", RegularLTO.Ctx);		llvm::make_unique<Module>("ld-temp.o", RegularLTO.Ctx);
RegularLTO.Mover = llvm::make_unique<IRMover>(*RegularLTO.CombinedModule);		RegularLTO.Mover = llvm::make_unique<IRMover>(*RegularLTO.CombinedModule);
}		}
Expected<std::unique_ptr<Module>> MOrErr =		Expected<std::unique_ptr<Module>> MOrErr =
getLazyBitcodeModule(Input->MBRef, RegularLTO.Ctx,		BM.getLazyModule(RegularLTO.Ctx, /*ShouldLazyLoadMetadata*/ true);
/*ShouldLazyLoadMetadata*/ true);
if (!MOrErr)		if (!MOrErr)
return MOrErr.takeError();		return MOrErr.takeError();

Module &M = **MOrErr;		Module &M = **MOrErr;
if (Error Err = M.materializeMetadata())		if (Error Err = M.materializeMetadata())
return Err;		return Err;
UpgradeDebugInfo(M);		UpgradeDebugInfo(M);

ModuleSymbolTable SymTab;		ModuleSymbolTable SymTab;
SymTab.addModule(&M);		SymTab.addModule(&M);

SmallPtrSet<GlobalValue *, 8> Used;		SmallPtrSet<GlobalValue *, 8> Used;
collectUsedGlobalVariables(M, Used, /*CompilerUsed*/ false);		collectUsedGlobalVariables(M, Used, /*CompilerUsed*/ false);

std::vector<GlobalValue *> Keep;		std::vector<GlobalValue *> Keep;

for (GlobalVariable &GV : M.globals())		for (GlobalVariable &GV : M.globals())
if (GV.hasAppendingLinkage())		if (GV.hasAppendingLinkage())
Keep.push_back(&GV);		Keep.push_back(&GV);

auto ResI = Res.begin();
for (const InputFile::Symbol &Sym :		for (const InputFile::Symbol &Sym :
make_range(InputFile::symbol_iterator(SymTab.symbols().begin(), SymTab,		make_range(InputFile::symbol_iterator(SymTab.symbols().begin(), SymTab,
nullptr),		nullptr),
InputFile::symbol_iterator(SymTab.symbols().end(), SymTab,		InputFile::symbol_iterator(SymTab.symbols().end(), SymTab,
nullptr))) {		nullptr))) {
assert(ResI != Res.end());		assert(ResI != ResE);
SymbolResolution Res = *ResI++;		SymbolResolution Res = *ResI++;
addSymbolToGlobalRes(Used, Sym, Res, 0);		addSymbolToGlobalRes(Used, Sym, Res, 0);

if (Sym.getFlags() & object::BasicSymbolRef::SF_Undefined)		if (Sym.getFlags() & object::BasicSymbolRef::SF_Undefined)
continue;		continue;
if (Res.Prevailing && Sym.isGV()) {		if (Res.Prevailing && Sym.isGV()) {
GlobalValue *GV = Sym.getGV();		GlobalValue *GV = Sym.getGV();
Keep.push_back(GV);		Keep.push_back(GV);
Show All 17 Lines	if (Sym.getFlags() & object::BasicSymbolRef::SF_Common) {
auto &CommonRes = RegularLTO.Commons[Sym.getGV()->getName()];		auto &CommonRes = RegularLTO.Commons[Sym.getGV()->getName()];
CommonRes.Size = std::max(CommonRes.Size, Sym.getCommonSize());		CommonRes.Size = std::max(CommonRes.Size, Sym.getCommonSize());
CommonRes.Align = std::max(CommonRes.Align, Sym.getCommonAlignment());		CommonRes.Align = std::max(CommonRes.Align, Sym.getCommonAlignment());
CommonRes.Prevailing |= Res.Prevailing;		CommonRes.Prevailing |= Res.Prevailing;
}		}

// FIXME: use proposed local attribute for FinalDefinitionInLinkageUnit.		// FIXME: use proposed local attribute for FinalDefinitionInLinkageUnit.
}		}
assert(ResI == Res.end());

return RegularLTO.Mover->move(std::move(*MOrErr), Keep,		return RegularLTO.Mover->move(std::move(*MOrErr), Keep,
[](GlobalValue &, IRMover::ValueAdder) {},		[](GlobalValue &, IRMover::ValueAdder) {},
/* LinkModuleInlineAsm */ true);		/* LinkModuleInlineAsm */ true);
}		}

// Add a ThinLTO object to the link.		// Add a ThinLTO object to the link.
Error LTO::addThinLTO(std::unique_ptr<InputFile> Input,		// FIXME: This function should not need to take as many parameters once we have
ArrayRef<SymbolResolution> Res) {		// a bitcode symbol table.
Module &M = *Input->Mod;		Error LTO::addThinLTO(BitcodeModule BM, Module &M,
		iterator_range<InputFile::symbol_iterator> Syms,
		const SymbolResolution *&ResI,
		const SymbolResolution *ResE) {
SmallPtrSet<GlobalValue *, 8> Used;		SmallPtrSet<GlobalValue *, 8> Used;
collectUsedGlobalVariables(M, Used, /*CompilerUsed*/ false);		collectUsedGlobalVariables(M, Used, /*CompilerUsed*/ false);

MemoryBufferRef MBRef = Input->MBRef;		Expected<std::unique_ptr<ModuleSummaryIndex>> SummaryOrErr = BM.getSummary();
Expected<std::unique_ptr<object::ModuleSummaryIndexObjectFile>>		if (!SummaryOrErr)
SummaryObjOrErr = object::ModuleSummaryIndexObjectFile::create(MBRef);		return SummaryOrErr.takeError();
if (!SummaryObjOrErr)		ThinLTO.CombinedIndex.mergeFrom(std::move(*SummaryOrErr),
return SummaryObjOrErr.takeError();
ThinLTO.CombinedIndex.mergeFrom((*SummaryObjOrErr)->takeIndex(),
ThinLTO.ModuleMap.size());		ThinLTO.ModuleMap.size());

auto ResI = Res.begin();		for (const InputFile::Symbol &Sym : Syms) {
for (const InputFile::Symbol &Sym : Input->symbols()) {		assert(ResI != ResE);
assert(ResI != Res.end());
SymbolResolution Res = *ResI++;		SymbolResolution Res = *ResI++;
addSymbolToGlobalRes(Used, Sym, Res, ThinLTO.ModuleMap.size() + 1);		addSymbolToGlobalRes(Used, Sym, Res, ThinLTO.ModuleMap.size() + 1);

if (Res.Prevailing && Sym.isGV())		if (Res.Prevailing && Sym.isGV())
ThinLTO.PrevailingModuleForGUID[Sym.getGV()->getGUID()] =		ThinLTO.PrevailingModuleForGUID[Sym.getGV()->getGUID()] =
MBRef.getBufferIdentifier();		BM.getModuleIdentifier();
}		}
assert(ResI == Res.end());

ThinLTO.ModuleMap[MBRef.getBufferIdentifier()] = MBRef;		ThinLTO.ModuleMap.insert({BM.getModuleIdentifier(), BM});
		mehdi_amini (Done): We'll arrive here from `for (InputFile::InputModule &IM : Input->Mods)` in `LTO::add()`. How are we distinguishing `BM.getModuleIdentifier()` when there are multiple modules per file? I assume we don't expect multiple modules with a summary in the same input file, but in this case we should check here that the module has been inserted and error otherwise.
		pcc (Author, Not Done): `BM.getModuleIdentifier()` will be the same for each `BitcodeModule` in the same input file by construction, so we can just test the return value of `insert` here. (I feel obliged to point out that this is yet another scenario where (path, byte slice) pairs would work better as a representation of ThinLTO modules.)
		mehdi_amini (Not Done): What is the other scenario? I don't remember what was specific to ThinLTO for this? Also, a `path/byte slice` pair doesn't fit the API in general, does it? We don't necessarily have a "file". That said, instead of a path the identifier could be a `<BufferID, offset>`, with `BufferID` intended to be a unique identifier provided by the client of the API per buffer. I'm not sure how deep we'd have to thread this though. I think that as long as the identifier is consistent with whatever is used here when we create an llvm::Module, everything should be OK.
		pcc (Author, Not Done):
		> What is the other scenario? I don't remember what was specific to ThinLTO for this?
		The main other scenario is (non-thin) archive files, which is already a problem as we don't handle that correctly in the distributed case, and we have workarounds in both the gold plugin and lld to give `-save-temps` temporaries appropriate names. This is ThinLTO-specific because we don't create temporaries for each input file under regular LTO.
		> Also a pair path/byte slice doesn't fit the API in general, does it? We don't necessarily have a "file".
		True, but it's also the case that we don't necessarily have a "file" now and we're already passing paths around. I imagine that clients which don't use files could just make something up for the offsets as they would make something up for file paths.
		> That said, instead of path the identifier could be a <BufferID, offset>, with BufferID intended to be a unique identifier provided by the client of the API per-buffer.
		I think that could work as well.
		mehdi_amini (Not Done): In the C API (and thus in ld64), the expectation is that every input buffer has to be provided with a unique ID. It happens that this ID is usually the file path (with `libFoo.a(member.o)` when a static archive is involved). Because the linker is supplying one member at a time, the unique ID in the API is enough. Having multiple modules per file is new though and would break this "unique ID" provided by the linker; we could suffix it when loading each individual bitcode/module though. The --save-temps is always using integers right now I believe (files are 1.thin.o, 2.thin.o, etc.).
		> True, but it's also the case that we don't necessarily have a "file" now and we're already passing paths around.
		That's just us being inconsistent; I tried to express "ModuleID" instead of path as much as possible.
		pcc (Author, Not Done):
		> Because the linker is supplying one member at a time, the unique ID in the API is enough.
		I think @davide had a case where an archive had two members with the same name, and they were only distinguishable by offset.
		> The --save-temps is always using integer right now I believe (files are 1.thin.o, 2.thin.o, etc.).
		Not always: http://llvm-cs.pcc.me.uk/lib/LTO/LTOBackend.cpp#77 gold and lld both pass `UseInputModulePath == true` here. We probably want to overhaul how the filenames look for `save-temps`, but I think in the end they should contain the module ID in some form.
		mehdi_amini (Not Done):
		> I think @davide had a case where an archive had two members with the same name, and they were only distinguishable by offset
		I remember this. I wouldn't be too concerned with this here, and would push the responsibility to provide unique IDs to the client. (Nothing prevents the linker from building an ID that always includes the offset, for instance.)
		pcc (Author, Not Done): Well, in the distributed case some component needs to understand the module IDs in order to distribute the work properly. Part of that may involve translating the module IDs into "file paths" (which may be actual file paths or conceptual ones). If we can arrange to use the same ("file path", offset) scheme in all linkers, that component can be shared between linkers. But to a certain extent this is all hypothetical; I think I'd want to prototype before being sure that this is the right design.
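
The two ideas in the thread above, testing the return value of `insert` and keying modules by an (identifier, offset) pair, can be sketched in isolation. This is only an illustration under stated assumptions: it uses `std::map` rather than the `MapVector` in the patch, and `ModuleHandle`, `ModuleKey`, `addByIdentifier`, `addByIdentifierAndOffset`, and `demo` are hypothetical names that do not exist in LLVM.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <string>
#include <utility>

// Hypothetical stand-in for a module handle; this is not LLVM's BitcodeModule.
struct ModuleHandle {
  std::string Identifier; // shared by every module in the same input buffer
  std::size_t Offset;     // byte offset of the module within that buffer
};

// Keyed by identifier only: a second module from the same buffer collides.
// Returns false when the identifier was already present.
bool addByIdentifier(std::map<std::string, ModuleHandle> &Map,
                     const ModuleHandle &M) {
  return Map.insert({M.Identifier, M}).second;
}

// Keyed by (identifier, offset): modules from one buffer stay distinct.
using ModuleKey = std::pair<std::string, std::size_t>;
bool addByIdentifierAndOffset(std::map<ModuleKey, ModuleHandle> &Map,
                              const ModuleHandle &M) {
  return Map.insert({ModuleKey(M.Identifier, M.Offset), M}).second;
}

// Small self-check exercising both keying schemes.
void demo() {
  ModuleHandle First{"a.o", 0};
  ModuleHandle Second{"a.o", 128};

  std::map<std::string, ModuleHandle> ById;
  assert(addByIdentifier(ById, First));
  assert(!addByIdentifier(ById, Second)); // same identifier: rejected

  std::map<ModuleKey, ModuleHandle> ByKey;
  assert(addByIdentifierAndOffset(ByKey, First));
  assert(addByIdentifierAndOffset(ByKey, Second)); // distinct offsets: accepted
}
```

With identifier-only keys, the failed insert is the signal a caller would turn into an error. With pair keys, both modules are kept, at the cost of threading the offset through every place that looks a module up.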
return Error::success();		return Error::success();
}		}

unsigned LTO::getMaxTasks() const {		unsigned LTO::getMaxTasks() const {
CalledGetMaxTasks = true;		CalledGetMaxTasks = true;
return RegularLTO.ParallelCodeGenParallelismLevel + ThinLTO.ModuleMap.size();		return RegularLTO.ParallelCodeGenParallelismLevel + ThinLTO.ModuleMap.size();
}		}

▲ Show 20 Lines • Show All 80 Lines • ▼ Show 20 Lines
public:		public:
ThinBackendProc(Config &Conf, ModuleSummaryIndex &CombinedIndex,		ThinBackendProc(Config &Conf, ModuleSummaryIndex &CombinedIndex,
const StringMap<GVSummaryMapTy> &ModuleToDefinedGVSummaries)		const StringMap<GVSummaryMapTy> &ModuleToDefinedGVSummaries)
: Conf(Conf), CombinedIndex(CombinedIndex),		: Conf(Conf), CombinedIndex(CombinedIndex),
ModuleToDefinedGVSummaries(ModuleToDefinedGVSummaries) {}		ModuleToDefinedGVSummaries(ModuleToDefinedGVSummaries) {}

virtual ~ThinBackendProc() {}		virtual ~ThinBackendProc() {}
virtual Error start(		virtual Error start(
unsigned Task, MemoryBufferRef MBRef,		unsigned Task, BitcodeModule BM,
const FunctionImporter::ImportMapTy &ImportList,		const FunctionImporter::ImportMapTy &ImportList,
const FunctionImporter::ExportSetTy &ExportList,		const FunctionImporter::ExportSetTy &ExportList,
const std::map<GlobalValue::GUID, GlobalValue::LinkageTypes> &ResolvedODR,		const std::map<GlobalValue::GUID, GlobalValue::LinkageTypes> &ResolvedODR,
MapVector<StringRef, MemoryBufferRef> &ModuleMap) = 0;		MapVector<StringRef, BitcodeModule> &ModuleMap) = 0;
virtual Error wait() = 0;		virtual Error wait() = 0;
};		};

namespace {		namespace {
class InProcessThinBackend : public ThinBackendProc {		class InProcessThinBackend : public ThinBackendProc {
ThreadPool BackendThreadPool;		ThreadPool BackendThreadPool;
AddStreamFn AddStream;		AddStreamFn AddStream;
NativeObjectCache Cache;		NativeObjectCache Cache;

Optional<Error> Err;		Optional<Error> Err;
std::mutex ErrMu;		std::mutex ErrMu;

public:		public:
InProcessThinBackend(		InProcessThinBackend(
Config &Conf, ModuleSummaryIndex &CombinedIndex,		Config &Conf, ModuleSummaryIndex &CombinedIndex,
unsigned ThinLTOParallelismLevel,		unsigned ThinLTOParallelismLevel,
const StringMap<GVSummaryMapTy> &ModuleToDefinedGVSummaries,		const StringMap<GVSummaryMapTy> &ModuleToDefinedGVSummaries,
AddStreamFn AddStream, NativeObjectCache Cache)		AddStreamFn AddStream, NativeObjectCache Cache)
: ThinBackendProc(Conf, CombinedIndex, ModuleToDefinedGVSummaries),		: ThinBackendProc(Conf, CombinedIndex, ModuleToDefinedGVSummaries),
BackendThreadPool(ThinLTOParallelismLevel),		BackendThreadPool(ThinLTOParallelismLevel),
AddStream(std::move(AddStream)), Cache(std::move(Cache)) {}		AddStream(std::move(AddStream)), Cache(std::move(Cache)) {}

Error runThinLTOBackendThread(		Error runThinLTOBackendThread(
AddStreamFn AddStream, NativeObjectCache Cache, unsigned Task,		AddStreamFn AddStream, NativeObjectCache Cache, unsigned Task,
MemoryBufferRef MBRef, ModuleSummaryIndex &CombinedIndex,		BitcodeModule BM, ModuleSummaryIndex &CombinedIndex,
const FunctionImporter::ImportMapTy &ImportList,		const FunctionImporter::ImportMapTy &ImportList,
const FunctionImporter::ExportSetTy &ExportList,		const FunctionImporter::ExportSetTy &ExportList,
const std::map<GlobalValue::GUID, GlobalValue::LinkageTypes> &ResolvedODR,		const std::map<GlobalValue::GUID, GlobalValue::LinkageTypes> &ResolvedODR,
const GVSummaryMapTy &DefinedGlobals,		const GVSummaryMapTy &DefinedGlobals,
MapVector<StringRef, MemoryBufferRef> &ModuleMap) {		MapVector<StringRef, BitcodeModule> &ModuleMap) {
auto RunThinBackend = [&](AddStreamFn AddStream) {		auto RunThinBackend = [&](AddStreamFn AddStream) {
LTOLLVMContext BackendContext(Conf);		LTOLLVMContext BackendContext(Conf);
Expected<std::unique_ptr<Module>> MOrErr =		Expected<std::unique_ptr<Module>> MOrErr = BM.parseModule(BackendContext);
parseBitcodeFile(MBRef, BackendContext);
if (!MOrErr)		if (!MOrErr)
return MOrErr.takeError();		return MOrErr.takeError();

return thinBackend(Conf, Task, AddStream, **MOrErr, CombinedIndex,		return thinBackend(Conf, Task, AddStream, **MOrErr, CombinedIndex,
ImportList, DefinedGlobals, ModuleMap);		ImportList, DefinedGlobals, ModuleMap);
};		};

auto ModuleID = MBRef.getBufferIdentifier();		auto ModuleID = BM.getModuleIdentifier();

if (!Cache || !CombinedIndex.modulePaths().count(ModuleID) ||		if (!Cache || !CombinedIndex.modulePaths().count(ModuleID) ||
all_of(CombinedIndex.getModuleHash(ModuleID),		all_of(CombinedIndex.getModuleHash(ModuleID),
[](uint32_t V) { return V == 0; }))		[](uint32_t V) { return V == 0; }))
// Cache disabled or no entry for this module in the combined index or		// Cache disabled or no entry for this module in the combined index or
// no module hash.		// no module hash.
return RunThinBackend(AddStream);		return RunThinBackend(AddStream);

SmallString<40> Key;		SmallString<40> Key;
// The module may be cached, this helps handling it.		// The module may be cached, this helps handling it.
computeCacheKey(Key, CombinedIndex, ModuleID, ImportList, ExportList,		computeCacheKey(Key, CombinedIndex, ModuleID, ImportList, ExportList,
ResolvedODR, DefinedGlobals);		ResolvedODR, DefinedGlobals);
if (AddStreamFn CacheAddStream = Cache(Task, Key))		if (AddStreamFn CacheAddStream = Cache(Task, Key))
return RunThinBackend(CacheAddStream);		return RunThinBackend(CacheAddStream);

return Error::success();		return Error::success();
}		}

Error start(		Error start(
unsigned Task, MemoryBufferRef MBRef,		unsigned Task, BitcodeModule BM,
const FunctionImporter::ImportMapTy &ImportList,		const FunctionImporter::ImportMapTy &ImportList,
const FunctionImporter::ExportSetTy &ExportList,		const FunctionImporter::ExportSetTy &ExportList,
const std::map<GlobalValue::GUID, GlobalValue::LinkageTypes> &ResolvedODR,		const std::map<GlobalValue::GUID, GlobalValue::LinkageTypes> &ResolvedODR,
MapVector<StringRef, MemoryBufferRef> &ModuleMap) override {		MapVector<StringRef, BitcodeModule> &ModuleMap) override {
StringRef ModulePath = MBRef.getBufferIdentifier();		StringRef ModulePath = BM.getModuleIdentifier();
assert(ModuleToDefinedGVSummaries.count(ModulePath));		assert(ModuleToDefinedGVSummaries.count(ModulePath));
const GVSummaryMapTy &DefinedGlobals =		const GVSummaryMapTy &DefinedGlobals =
ModuleToDefinedGVSummaries.find(ModulePath)->second;		ModuleToDefinedGVSummaries.find(ModulePath)->second;
BackendThreadPool.async(		BackendThreadPool.async(
[=](MemoryBufferRef MBRef, ModuleSummaryIndex &CombinedIndex,		[=](BitcodeModule BM, ModuleSummaryIndex &CombinedIndex,
const FunctionImporter::ImportMapTy &ImportList,		const FunctionImporter::ImportMapTy &ImportList,
const FunctionImporter::ExportSetTy &ExportList,		const FunctionImporter::ExportSetTy &ExportList,
const std::map<GlobalValue::GUID, GlobalValue::LinkageTypes>		const std::map<GlobalValue::GUID, GlobalValue::LinkageTypes>
&ResolvedODR,		&ResolvedODR,
const GVSummaryMapTy &DefinedGlobals,		const GVSummaryMapTy &DefinedGlobals,
MapVector<StringRef, MemoryBufferRef> &ModuleMap) {		MapVector<StringRef, BitcodeModule> &ModuleMap) {
Error E = runThinLTOBackendThread(		Error E = runThinLTOBackendThread(
AddStream, Cache, Task, MBRef, CombinedIndex, ImportList,		AddStream, Cache, Task, BM, CombinedIndex, ImportList,
ExportList, ResolvedODR, DefinedGlobals, ModuleMap);		ExportList, ResolvedODR, DefinedGlobals, ModuleMap);
if (E) {		if (E) {
std::unique_lock<std::mutex> L(ErrMu);		std::unique_lock<std::mutex> L(ErrMu);
if (Err)		if (Err)
Err = joinErrors(std::move(*Err), std::move(E));		Err = joinErrors(std::move(*Err), std::move(E));
else		else
Err = std::move(E);		Err = std::move(E);
}		}
},		},
MBRef, std::ref(CombinedIndex), std::ref(ImportList),		BM, std::ref(CombinedIndex), std::ref(ImportList),
std::ref(ExportList), std::ref(ResolvedODR), std::ref(DefinedGlobals),		std::ref(ExportList), std::ref(ResolvedODR), std::ref(DefinedGlobals),
std::ref(ModuleMap));		std::ref(ModuleMap));
return Error::success();		return Error::success();
}		}

Error wait() override {		Error wait() override {
BackendThreadPool.wait();		BackendThreadPool.wait();
if (Err)		if (Err)
▲ Show 20 Lines • Show All 49 Lines • ▼ Show 20 Lines	WriteIndexesThinBackend(
std::string OldPrefix, std::string NewPrefix, bool ShouldEmitImportsFiles,		std::string OldPrefix, std::string NewPrefix, bool ShouldEmitImportsFiles,
std::string LinkedObjectsFileName)		std::string LinkedObjectsFileName)
: ThinBackendProc(Conf, CombinedIndex, ModuleToDefinedGVSummaries),		: ThinBackendProc(Conf, CombinedIndex, ModuleToDefinedGVSummaries),
OldPrefix(OldPrefix), NewPrefix(NewPrefix),		OldPrefix(OldPrefix), NewPrefix(NewPrefix),
ShouldEmitImportsFiles(ShouldEmitImportsFiles),		ShouldEmitImportsFiles(ShouldEmitImportsFiles),
LinkedObjectsFileName(LinkedObjectsFileName) {}		LinkedObjectsFileName(LinkedObjectsFileName) {}

Error start(		Error start(
unsigned Task, MemoryBufferRef MBRef,		unsigned Task, BitcodeModule BM,
const FunctionImporter::ImportMapTy &ImportList,		const FunctionImporter::ImportMapTy &ImportList,
const FunctionImporter::ExportSetTy &ExportList,		const FunctionImporter::ExportSetTy &ExportList,
const std::map<GlobalValue::GUID, GlobalValue::LinkageTypes> &ResolvedODR,		const std::map<GlobalValue::GUID, GlobalValue::LinkageTypes> &ResolvedODR,
MapVector<StringRef, MemoryBufferRef> &ModuleMap) override {		MapVector<StringRef, BitcodeModule> &ModuleMap) override {
StringRef ModulePath = MBRef.getBufferIdentifier();		StringRef ModulePath = BM.getModuleIdentifier();
std::string NewModulePath =		std::string NewModulePath =
getThinLTOOutputFile(ModulePath, OldPrefix, NewPrefix);		getThinLTOOutputFile(ModulePath, OldPrefix, NewPrefix);

std::error_code EC;		std::error_code EC;
if (!LinkedObjectsFileName.empty()) {		if (!LinkedObjectsFileName.empty()) {
if (!LinkedObjectsFile) {		if (!LinkedObjectsFile) {
LinkedObjectsFile = llvm::make_unique<raw_fd_ostream>(		LinkedObjectsFile = llvm::make_unique<raw_fd_ostream>(
LinkedObjectsFileName, EC, sys::fs::OpenFlags::F_None);		LinkedObjectsFileName, EC, sys::fs::OpenFlags::F_None);
▲ Show 20 Lines • Show All 128 Lines • Show Last 20 Lines

llvm/lib/LTO/LTOBackend.cpp

Show First 20 Lines • Show All 312 Lines • ▼ Show 20 Lines	Error lto::backend(Config &C, AddStreamFn AddStream,
}		}
return Error::success();		return Error::success();
}		}

Error lto::thinBackend(Config &Conf, unsigned Task, AddStreamFn AddStream,		Error lto::thinBackend(Config &Conf, unsigned Task, AddStreamFn AddStream,
Module &Mod, ModuleSummaryIndex &CombinedIndex,		Module &Mod, ModuleSummaryIndex &CombinedIndex,
const FunctionImporter::ImportMapTy &ImportList,		const FunctionImporter::ImportMapTy &ImportList,
const GVSummaryMapTy &DefinedGlobals,		const GVSummaryMapTy &DefinedGlobals,
MapVector<StringRef, MemoryBufferRef> &ModuleMap) {		MapVector<StringRef, BitcodeModule> &ModuleMap) {
Expected<const Target *> TOrErr = initAndLookupTarget(Conf, Mod);		Expected<const Target *> TOrErr = initAndLookupTarget(Conf, Mod);
if (!TOrErr)		if (!TOrErr)
return TOrErr.takeError();		return TOrErr.takeError();

std::unique_ptr<TargetMachine> TM =		std::unique_ptr<TargetMachine> TM =
createTargetMachine(Conf, Mod.getTargetTriple(), *TOrErr);		createTargetMachine(Conf, Mod.getTargetTriple(), *TOrErr);

handleAsmUndefinedRefs(Mod, *TM);		handleAsmUndefinedRefs(Mod, *TM);
Show All 18 Lines	Error lto::thinBackend(Config &Conf, unsigned Task, AddStreamFn AddStream,

if (Conf.PostInternalizeModuleHook &&		if (Conf.PostInternalizeModuleHook &&
!Conf.PostInternalizeModuleHook(Task, Mod))		!Conf.PostInternalizeModuleHook(Task, Mod))
return Error::success();		return Error::success();

auto ModuleLoader = [&](StringRef Identifier) {		auto ModuleLoader = [&](StringRef Identifier) {
assert(Mod.getContext().isODRUniquingDebugTypes() &&		assert(Mod.getContext().isODRUniquingDebugTypes() &&
"ODR Type uniquing should be enabled on the context");		"ODR Type uniquing should be enabled on the context");
return getLazyBitcodeModule(ModuleMap[Identifier], Mod.getContext(),		auto I = ModuleMap.find(Identifier);
		assert(I != ModuleMap.end());
		return I->second.getLazyModule(Mod.getContext(),
/*ShouldLazyLoadMetadata=*/true);		/*ShouldLazyLoadMetadata=*/true);
};		};

FunctionImporter Importer(CombinedIndex, ModuleLoader);		FunctionImporter Importer(CombinedIndex, ModuleLoader);
if (Error Err = Importer.importFunctions(Mod, ImportList).takeError())		if (Error Err = Importer.importFunctions(Mod, ImportList).takeError())
return Err;		return Err;

if (Conf.PostImportModuleHook && !Conf.PostImportModuleHook(Task, Mod))		if (Conf.PostImportModuleHook && !Conf.PostImportModuleHook(Task, Mod))
return Error::success();		return Error::success();

if (!opt(Conf, TM.get(), Task, Mod, /*IsThinLTO=*/true))		if (!opt(Conf, TM.get(), Task, Mod, /*IsThinLTO=*/true))
return Error::success();		return Error::success();

codegen(Conf, TM.get(), AddStream, Task, Mod);		codegen(Conf, TM.get(), AddStream, Task, Mod);
return Error::success();		return Error::success();
}		}

llvm/test/LTO/Resolution/X86/mixed_lto.ll

	; Test mixed-mode LTO (mix of regular and thin LTO objects)			; Test mixed-mode LTO (mix of regular and thin LTO objects)
	; RUN: opt %s -o %t1.o			; RUN: opt %s -o %t1.o
	; RUN: opt -module-summary %p/Inputs/mixed_lto.ll -o %t2.o			; RUN: opt -module-summary %p/Inputs/mixed_lto.ll -o %t2.o

	; RUN: llvm-lto2 -o %t3.o %t2.o %t1.o -r %t2.o,main,px -r %t2.o,g, -r %t1.o,g,px			; RUN: llvm-lto2 -o %t3.o %t2.o %t1.o -r %t2.o,main,px -r %t2.o,g, -r %t1.o,g,px

	; Task 0 is the regular LTO file (this file)			; Task 0 is the regular LTO file (this file)
	; RUN: llvm-nm %t3.o.0 | FileCheck %s --check-prefix=NM0			; RUN: llvm-nm %t3.o.0 | FileCheck %s --check-prefix=NM0
	; NM0: T g			; NM0: T g

	; Task 1 is the (first) ThinLTO file (Inputs/mixed_lto.ll)			; Task 1 is the (first) ThinLTO file (Inputs/mixed_lto.ll)
	; RUN: llvm-nm %t3.o.1 | FileCheck %s --check-prefix=NM1			; RUN: llvm-nm %t3.o.1 | FileCheck %s --check-prefix=NM1
	; NM1-DAG: T main			; NM1-DAG: T main
	; NM1-DAG: U g			; NM1-DAG: U g

				; Do the same test again, but with the regular and thin LTO modules in the same file.
				; RUN: llvm-cat -b -o %t4.o %t2.o %t1.o
				; RUN: llvm-lto2 -o %t5.o %t4.o -r %t4.o,main,px -r %t4.o,g, -r %t4.o,g,px
				; RUN: llvm-nm %t5.o.0 | FileCheck %s --check-prefix=NM0
				; RUN: llvm-nm %t5.o.1 | FileCheck %s --check-prefix=NM1

	target triple = "x86_64-unknown-linux-gnu"			target triple = "x86_64-unknown-linux-gnu"
	define i32 @g() {			define i32 @g() {
	ret i32 0			ret i32 0
	}			}

LTO: Add support for multi-module bitcode files. (Closed, Public)

Revision Contents

Diff 79959

clang/lib/CodeGen/BackendUtil.cpp

llvm/include/llvm/Bitcode/BitcodeReader.h

llvm/include/llvm/LTO/LTO.h

llvm/include/llvm/LTO/LTOBackend.h

llvm/lib/LTO/LTO.cpp

llvm/lib/LTO/LTOBackend.cpp

llvm/test/LTO/Resolution/X86/mixed_lto.ll