This is an archive of the discontinued LLVM Phabricator instance.

factor the "create a symbol table from a module" code out of IRObjectFile and into a new ModuleSymbolTable class, which would conceptually be responsible for maintaining the mapping between symbol table entries and GlobalValues. This would be similar to what I proposed in D23132, but at least to begin with it would just be a straight refactoring of the post-D26928 IRObjectFile code
use that class from IRObjectFile
change lib/LTO to use ModuleSymbolTable directly
remove getModule, takeModule, getSymbolGV interfaces from IRObjectFile

When we get around to implementing bitcode symbol tables we can:

implement the bitcode symbol table writer in terms of ModuleSymbolTable
change IRObjectFile to read from the bitcode symbol table instead of using ModuleSymbolTable directly

I will start working on ModuleSymbolTable.

An important point which I forgot to mention: the symbol table stored by ModuleSymbolTable would correspond to any number of modules (all of the same target triple).
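The constraint described above (one symbol table fed by any number of modules, all sharing a target triple) can be sketched without LLVM. The following is a minimal illustration only: `ToyModule` and `ToySymbolTable` are hypothetical stand-ins, not the real `llvm::Module` or `ModuleSymbolTable`, and the use of an exception for the mismatch case is an assumption made for brevity (LLVM itself does not use exceptions).

```cpp
#include <cassert>
#include <stdexcept>
#include <string>
#include <vector>

// Hypothetical stand-in for llvm::Module: a target triple plus symbol names.
struct ToyModule {
  std::string Triple;
  std::vector<std::string> Symbols;
};

// Hypothetical stand-in for ModuleSymbolTable: one flat symbol list that can
// be fed by any number of modules, provided they agree on the target triple.
class ToySymbolTable {
  std::string Triple;
  bool HasTriple = false;
  std::vector<std::string> SymTab;

public:
  void addModule(const ToyModule &M) {
    if (!HasTriple) {
      // The first module fixes the triple for the whole table.
      Triple = M.Triple;
      HasTriple = true;
    } else if (M.Triple != Triple) {
      // Reject the module before touching the symbol list.
      throw std::invalid_argument("module triple mismatch: " + M.Triple);
    }
    SymTab.insert(SymTab.end(), M.Symbols.begin(), M.Symbols.end());
  }

  const std::vector<std::string> &symbols() const { return SymTab; }

  const std::string &triple() const {
    assert(HasTriple && "no module has been added yet");
    return Triple;
  }
};
```

A client would call `addModule` once per module and then enumerate `symbols()` without caring which module defined which symbol.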

In D26951#603439, @pcc wrote:

An important point which I forgot to mention: the symbol table stored by ModuleSymbolTable would correspond to any number of modules (all of the same target triple).

That seems like an arbitrary choice, driven only by the current use case of splitting vtables for LTO.

In D26951#603447, @mehdi_amini wrote:

That seems like an arbitrary choice, driven only by the current use case of splitting vtables for LTO.

I think in general there are two possible cases:

where by design the client needs to have multiple conceptual "views" into the input file (e.g. fat binaries, CUDA, OpenMP)
where the client has a single "view" and does not care about which symbol is defined in which module (e.g. regular/thin LTO splitting)

The client's use of ModuleSymbolTable (and the rest of the lib/Object interface in general) needs to be driven by that fundamental design decision of where the split lies. So for the fat binary scenario I would see the client creating one ModuleSymbolTable (and one bitcode symbol table) for each architecture, and the IRObjectFile growing a way to choose the architecture (as we do in MachOObjectFile for example).

Right, but the "triple" as a discriminator seems arbitrary to me: what about use cases where we ship a "fat" object file containing bitcode for a non-optimized debug build of the module and an optimized one? Or for building with and without options like freestanding? Or with and without the sanitizers?

I'm fine with being pragmatic and making it work for CFI in LTO, I just want to make sure that the API and the design of the ModuleSymbolTable / IRObjectFile relationship does not make too many assumptions about it.

In D26951#603478, @mehdi_amini wrote:

Right, but the "triple" as a discriminator seems arbitrary to me: what about use cases where we ship a "fat" object file containing bitcode for a non-optimized debug build of the module and an optimized one? Or for building with and without options like freestanding? Or with and without the sanitizers?

I think we're confusing a couple of things here. I am not saying that the triple would be the discriminator; the discriminator could in principle be anything the client wants (and could be chosen at BitcodeWriter time). The reason I mentioned that modules associated with a single ModuleSymbolTable should have the same triple is to ensure that the eventual object files are compatible and that name mangling happens consistently. I am not precluding having multiple ModuleSymbolTables whose modules happen to have the same target triple.
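To make the distinction concrete, here is a small sketch of modules grouped into separate "views" by a client-chosen discriminator, where two views may share a triple but a single view may not mix triples. All names (`TaggedModule`, `View`, `buildViews`, `viewFor`) are hypothetical and exist only for illustration; nothing here reflects an actual LLVM interface, and throwing on a mismatch is an assumption of the sketch.

```cpp
#include <cassert>
#include <map>
#include <stdexcept>
#include <string>
#include <vector>

// Hypothetical module record carrying a client-chosen discriminator
// (for example "debug", "opt", or "asan").
struct TaggedModule {
  std::string Discriminator;
  std::string Triple;
  std::vector<std::string> Symbols;
};

// One symbol table per discriminator value.
struct View {
  std::string Triple;
  std::vector<std::string> Symbols;
};

std::map<std::string, View> buildViews(const std::vector<TaggedModule> &Mods) {
  std::map<std::string, View> Views;
  for (const TaggedModule &M : Mods) {
    // emplace leaves an existing view untouched and returns it.
    auto Res = Views.emplace(M.Discriminator, View{M.Triple, {}});
    View &V = Res.first->second;
    // Within one view, every module must share the triple.
    if (V.Triple != M.Triple)
      throw std::invalid_argument("mixed triples in view " + M.Discriminator);
    V.Symbols.insert(V.Symbols.end(), M.Symbols.begin(), M.Symbols.end());
  }
  return Views;
}

const View &viewFor(const std::map<std::string, View> &Views,
                    const std::string &Key) {
  auto It = Views.find(Key);
  assert(It != Views.end() && "no view for this discriminator");
  return It->second;
}
```

In this sketch a "debug" view and an "opt" view can both target the same triple, which matches the point that the triple is a consistency requirement rather than the discriminator itself.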

Sure, this part is fine. I think what tickled me was "the IRObjectFile growing a way to choose the architecture", which I read as "you pass in the architecture and it gets all the right modules from the bitcode file and gets a ModuleSymbolTable for them".

In D26951#603493, @mehdi_amini wrote:

Sure, this part is fine. I think what tickled me was "the IRObjectFile growing a way to choose the architecture", which I read as "you pass in the architecture and it gets all the right modules from the bitcode file and gets a ModuleSymbolTable for them".

Right, "architecture" was just an example here; we can make a better decision about what exactly the discriminator should be when the time comes to implement a feature that depends on it.

Depends on D27079

pcc removed parent revisions: D26778: Add llvm-modextract tool., D26928: Object: Simplify the IRObjectFile symbol iterator implementation. (Nov 23 2016, 6:32 PM)

pcc updated this object.

LGTM.

llvm/include/llvm/Object/IRObjectFile.h
Line 31 (On Diff #79182): I'd rather avoid the ultra-contraction; what about `Mods`?

This revision is now accepted and ready to land. (Nov 23 2016, 9:41 PM)

pcc mentioned this in D27073: Object: Extract a ModuleSymbolTable class from IRObjectFile. (Nov 28 2016, 11:57 AM)

Address review comments; pick up tool rename

pcc added a child revision: D27313: LTO: Add support for multi-module bitcode files. (Dec 1 2016, 11:36 AM)

Closed by commit rL289578: Object: Make IRObjectFile own multiple modules and enumerate symbols from all… (authored by pcc). (Dec 13 2016, 12:30 PM)

This revision was automatically updated to reflect the committed changes.

Revision Contents

Path	Size
llvm/trunk/include/llvm/Object/IRObjectFile.h	5 lines
llvm/trunk/lib/Object/IRObjectFile.cpp	39 lines
llvm/trunk/test/Object/Inputs/multi-module.ll	3 lines
llvm/trunk/test/Object/multi-module.ll	8 lines

Diff 81281

llvm/trunk/include/llvm/Object/IRObjectFile.h

[22 lines not shown]
 class Module;
 class GlobalValue;
 class Triple;

 namespace object {
 class ObjectFile;

 class IRObjectFile : public SymbolicFile {
-  std::unique_ptr<Module> M;
+  std::vector<std::unique_ptr<Module>> Mods;
   ModuleSymbolTable SymTab;
-  IRObjectFile(MemoryBufferRef Object, std::unique_ptr<Module> M);
+  IRObjectFile(MemoryBufferRef Object,
+               std::vector<std::unique_ptr<Module>> Mods);

 public:
   ~IRObjectFile() override;
   void moveSymbolNext(DataRefImpl &Symb) const override;
   std::error_code printSymbolName(raw_ostream &OS,
                                   DataRefImpl Symb) const override;
   uint32_t getSymbolFlags(DataRefImpl Symb) const override;
   basic_symbol_iterator symbol_begin() const override;
[25 lines not shown]

llvm/trunk/lib/Object/IRObjectFile.cpp

[29 lines not shown]
 #include "llvm/Object/ObjectFile.h"
 #include "llvm/Support/MemoryBuffer.h"
 #include "llvm/Support/SourceMgr.h"
 #include "llvm/Support/TargetRegistry.h"
 #include "llvm/Support/raw_ostream.h"
 using namespace llvm;
 using namespace object;

-IRObjectFile::IRObjectFile(MemoryBufferRef Object, std::unique_ptr<Module> Mod)
-    : SymbolicFile(Binary::ID_IR, Object), M(std::move(Mod)) {
-  SymTab.addModule(M.get());
+IRObjectFile::IRObjectFile(MemoryBufferRef Object,
+                           std::vector<std::unique_ptr<Module>> Mods)
+    : SymbolicFile(Binary::ID_IR, Object), Mods(std::move(Mods)) {
+  for (auto &M : this->Mods)
+    SymTab.addModule(M.get());
 }

 IRObjectFile::~IRObjectFile() {}

 static ModuleSymbolTable::Symbol getSym(DataRefImpl &Symb) {
   return *reinterpret_cast<ModuleSymbolTable::Symbol *>(Symb.p);
 }

[19 lines not shown]

 basic_symbol_iterator IRObjectFile::symbol_end() const {
   DataRefImpl Ret;
   Ret.p = reinterpret_cast<uintptr_t>(SymTab.symbols().data() +
                                       SymTab.symbols().size());
   return basic_symbol_iterator(BasicSymbolRef(Ret, this));
 }

-StringRef IRObjectFile::getTargetTriple() const { return M->getTargetTriple(); }
+StringRef IRObjectFile::getTargetTriple() const {
+  // Each module must have the same target triple, so we arbitrarily access the
+  // first one.
+  return Mods[0]->getTargetTriple();
+}

 ErrorOr<MemoryBufferRef> IRObjectFile::findBitcodeInObject(const ObjectFile &Obj) {
   for (const SectionRef &Sec : Obj.sections()) {
     if (Sec.isBitcode()) {
       StringRef SecContents;
       if (std::error_code EC = Sec.getContents(SecContents))
         return EC;
       return MemoryBufferRef(SecContents, Obj.getFileName());
[18 lines not shown; enclosing context: case sys::fs::file_magic::coff_object: {]
     return findBitcodeInObject(*ObjFile->get());
   }
   default:
     return object_error::invalid_file_type;
   }
 }

 Expected<std::unique_ptr<IRObjectFile>>
-llvm::object::IRObjectFile::create(MemoryBufferRef Object,
-                                   LLVMContext &Context) {
+IRObjectFile::create(MemoryBufferRef Object, LLVMContext &Context) {
   ErrorOr<MemoryBufferRef> BCOrErr = findBitcodeInMemBuffer(Object);
   if (!BCOrErr)
     return errorCodeToError(BCOrErr.getError());

-  Expected<std::unique_ptr<Module>> MOrErr =
-      getLazyBitcodeModule(*BCOrErr, Context,
-                           /*ShouldLazyLoadMetadata*/ true);
-  if (!MOrErr)
-    return MOrErr.takeError();
+  Expected<std::vector<BitcodeModule>> BMsOrErr =
+      getBitcodeModuleList(*BCOrErr);
+  if (!BMsOrErr)
+    return BMsOrErr.takeError();

-  std::unique_ptr<Module> &M = MOrErr.get();
+  std::vector<std::unique_ptr<Module>> Mods;
+  for (auto BM : *BMsOrErr) {
+    Expected<std::unique_ptr<Module>> MOrErr =
+        BM.getLazyModule(Context, /*ShouldLazyLoadMetadata*/ true);
+    if (!MOrErr)
+      return MOrErr.takeError();
+
+    Mods.push_back(std::move(*MOrErr));
+  }

   return std::unique_ptr<IRObjectFile>(
-      new IRObjectFile(*BCOrErr, std::move(M)));
+      new IRObjectFile(*BCOrErr, std::move(Mods)));
 }
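
The new loop in IRObjectFile::create follows a common pattern: load each module in turn, return the first error encountered, and otherwise move every loaded module into an owning vector. The sketch below reproduces that control flow in standalone C++ so it can be compiled without LLVM. `Result`, `ToyModule`, `loadOne`, and `loadAll` are hypothetical names; `Result` is only a rough stand-in for `llvm::Expected`, which has different semantics (for instance, it requires errors to be checked).

```cpp
#include <cassert>
#include <memory>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for llvm::Module.
struct ToyModule {
  std::string Name;
};

// Hypothetical, much-simplified stand-in for llvm::Expected<T>.
template <typename T> struct Result {
  T Value;
  std::string Error; // empty means success
  bool ok() const { return Error.empty(); }
};

// Hypothetical loader: fails when given an empty name.
Result<std::unique_ptr<ToyModule>> loadOne(const std::string &Name) {
  if (Name.empty())
    return {nullptr, "cannot load unnamed module"};
  return {std::unique_ptr<ToyModule>(new ToyModule{Name}), ""};
}

// Load every module, or report the first failure and return no modules.
Result<std::vector<std::unique_ptr<ToyModule>>>
loadAll(const std::vector<std::string> &Names) {
  std::vector<std::unique_ptr<ToyModule>> Mods;
  for (const std::string &Name : Names) {
    Result<std::unique_ptr<ToyModule>> MOrErr = loadOne(Name);
    if (!MOrErr.ok())
      return {{}, MOrErr.Error};
    Mods.push_back(std::move(MOrErr.Value));
  }
  // On success, exactly one module was loaded per requested name.
  assert(Mods.size() == Names.size());
  return {std::move(Mods), ""};
}
```

One thing the sketch makes visible is that a successful result may hold an empty vector when no names are passed in. Whether the real module list can be empty is not stated in this review, so any code that indexes the first element, as getTargetTriple does with Mods[0], relies on at least one module being present.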

llvm/trunk/test/Object/Inputs/multi-module.ll

+define void @f2() {
+  ret void
+}

llvm/trunk/test/Object/multi-module.ll

+; RUN: llvm-cat -o - %s %S/Inputs/multi-module.ll | llvm-nm - | FileCheck %s
+
+; CHECK: T f1
+; CHECK: T f2
+
+define void @f1() {
+  ret void
+}

Object: Make IRObjectFile own multiple modules and enumerate symbols from all modules. [Closed, Public]
